About This Book 


The primary objective of this manual is to help programmers provide software that is 
compatible across the family of 32-bit PowerPC™ processors. Because the PowerPC 
architecture is designed to be flexible to support a broad range of both 32 and 64-bit 
processors, this book provides a general description of features that are common to 
PowerPC processors and indicates those features that are optional or that may be 
implemented differently in the design of each processor. 

This book is a revision of an earlier document titled “PowerPC Microprocessor Family: 
The Programming Environments,” which describes both the 64- and the 32-bit versions of 
the PowerPC architecture. The information in this manual defines only the 32-bit 
version of the architecture. There is also a related document titled “PowerPC 
Microprocessor Family: The Programming Environments for 32-Bit Microprocessors,” 
which was developed by Motorola. Both books describe the 32-bit version of the PowerPC 
architecture and reflect changes to the PowerPC architecture made subsequent to the 
publication of “PowerPC Microprocessor Family: The Programming Environments,” Rev. 
0 and Rev. 0.1. 

To locate any published errata or updates for this and other documents, refer to the world- 
wide web at http://www.chips.ibm.com/products/ppc, or at http://www.mot.com/powerpc/. 

For designers working with a specific processor, this book should be used in conjunction 
with the user’s manual for that processor. For information regarding variances between a 
processor implementation and the version of the PowerPC architecture reflected in this 
document, see the reference to Implementation Variances Relative to Rev. 1 of The 
Programming Environments Manual described in “PowerPC Documentation,” on Page 
xxix. 


This document distinguishes between the three levels, or programming environments, of 
the PowerPC architecture, which are as follows: 


• PowerPC user instruction set architecture (UISA) — The UISA defines the level of 
the architecture to which user-level software should conform. The UISA defines the 
base user-level instruction set, user-level registers, data types, memory conventions, 
and the memory and programming models seen by application programmers. 

• PowerPC virtual environment architecture (VEA) — The VEA, which is the smallest 
component of the PowerPC architecture, defines additional user-level functionality 
that falls outside typical user-level software requirements. The VEA describes the 
memory model for an environment in which multiple processors or other devices 
can access external memory, and defines aspects of the cache model and cache 
control instructions from a user-level perspective. The resources defined by the VEA 
are particularly useful for optimizing memory accesses and for managing resources 
in an environment in which other processors and other devices can access external 
memory. 

Implementations that conform to the PowerPC VEA also adhere to the UISA, but 
may not necessarily adhere to the OEA. 

• PowerPC operating environment architecture (OEA) — The OEA defines supervisor- 
level resources typically required by an operating system. The OEA defines the 
PowerPC memory management model, supervisor-level registers, and the exception 
model. 

Implementations that conform to the PowerPC OEA also conform to the PowerPC 
UISA and VEA. 

It is important to note that some resources are defined more generally at one level in the 
architecture and more specifically at another. For example, conditions that can cause a 
floating-point exception are defined by the UISA, while the exception mechanism itself is 
defined by the OEA. 

Because it is important to distinguish between the levels of the architecture in order to 
ensure compatibility across multiple platforms, those distinctions are shown clearly 
throughout this book. The level of the architecture to which text refers is indicated in the 
outer margin, using the conventions shown in “Conventions,” on Page xxxi. 

This book does not attempt to replace the PowerPC architecture specification, which 
defines the architecture from the perspective of the three programming environments and 
which remains the defining document for the PowerPC architecture. This book reflects 
changes made to the architecture before August 6, 1996. These changes are described in 
Section 1.3, “Changes to this Document.” For information about the architecture 
specification, see “General Information,” on Page xxviii. 

For ease in reference, this book and the processor user’s manuals have arranged the 
architecture information into topics that build upon one another, beginning with a 
description and complete summary of registers and instructions (for all three environments) 
and progressing to more specialized topics such as the cache, exception, and memory 
management models. As such, chapters may include information from multiple levels of 
the architecture; for example, the discussion of the cache model uses information from both 
the VEA and the OEA. 

It is beyond the scope of this manual to describe individual PowerPC processors. It must be 
kept in mind that each PowerPC processor is unique in its implementation of the PowerPC 
architecture. 

The information in this book is subject to change without notice, as described in the 
disclaimers on the title page of this book. As with any technical documentation, it is the 
readers’ responsibility to be sure they are using the most recent version of 
the documentation. For more information, contact your sales representative. 

Audience 

This manual is intended for system software and hardware developers and application 
programmers who want to develop products for the 32-bit PowerPC processors. It is 
assumed that the reader understands operating systems, microprocessor system design, and 
the basic principles of RISC processing. 

This book describes only the 32-bit portions of the PowerPC architecture. The information 
in this manual is also presented separately in PowerPC Microprocessor Family: The 
Programming Environments for 32-Bit Microprocessors. 

Organization 

Following is a summary and a brief description of the major sections of this manual: 

• Chapter 1, “Overview,” is useful for those who want a general understanding of the 
features and functions of the PowerPC architecture. This chapter describes the 
flexible nature of the PowerPC architecture definition and provides an overview of 
how the PowerPC architecture defines the register set, operand conventions, 
addressing modes, instruction set, cache model, exception model, and memory 
management model. 

• Chapter 2, “PowerPC Register Set,” is useful for software engineers who need to 
understand the PowerPC programming model for the three programming 
environments and the functionality of the PowerPC registers. 

• Chapter 3, “Operand Conventions,” describes PowerPC conventions for storing data 
in memory, including information regarding alignment, single- and double- 
precision floating-point conventions, and big- and little-endian byte ordering. 

• Chapter 4, “Addressing Modes and Instruction Set Summary,” provides an overview 
of the PowerPC addressing modes and a description of the PowerPC instructions. 
Instructions are organized by function. 

• Chapter 5, “Cache Model and Memory Coherency,” provides a discussion of the 
cache and memory model defined by the VEA and aspects of the cache model that 
are defined by the OEA. 

• Chapter 6, “Exceptions,” describes the exception model defined in the OEA. 

• Chapter 7, “Memory Management,” provides descriptions of the PowerPC address 
translation and memory protection mechanism as defined by the OEA. 

• Chapter 8, “Instruction Set,” functions as a handbook for the PowerPC instruction 
set. Instructions are sorted by mnemonic. Each instruction description includes the 
instruction formats and an individualized legend that provides such information as 
the level(s) of the PowerPC architecture in which the instruction may be found and 
the privilege level of the instruction. 
• Appendix A, “PowerPC Instruction Set Listings,” lists all the PowerPC instructions. 
Instructions are grouped according to mnemonic, opcode, function, and form. 

• Appendix B, “POWER Architecture Cross Reference,” identifies the differences 
that must be managed in migration from the POWER architecture to the PowerPC 
architecture. 

• Appendix C, “Multiple-Precision Shifts,” describes how multiple-precision shift 
operations can be programmed as defined by the UISA. 

• Appendix D, “Floating-Point Models,” gives examples of how the floating-point 
conversion instructions can be used to perform various conversions as described in 
the UISA. 

• Appendix E, “Synchronization Programming Examples,” gives examples showing 
how synchronization instructions can be used to emulate various synchronization 
primitives and how to provide more complex forms of synchronization. 

• Appendix F, “Simplified Mnemonics,” provides a set of simplified mnemonic 
examples and symbols. 

• This manual also includes a glossary and an index. 

Suggested Reading 

This section lists additional reading that provides background for the information in this 
manual as well as general information about the PowerPC architecture. 

General Information 

The following documentation provides useful information about the PowerPC architecture 
and computer architecture in general: 

• The following books are available from the Morgan-Kaufmann Publishers, 340 Pine 
Street, Sixth Floor, San Francisco, CA 94104; Tel. (800) 745-7323 (U.S.A.), (415) 
392-2665 (International); internet address: mkp@mkp.com. 

— The PowerPC Architecture: A Specification for a New Family of RISC 
Processors, Second Edition, by International Business Machines, Inc. 

Updates to the architecture specification are accessible via the world-wide web 
at http://www.austin.ibm.com/tech/ppc-chg.html. 

— PowerPC Microprocessor Common Hardware Reference Platform: A System 
Architecture, by Apple Computer, Inc., International Business Machines, Inc., 
and Motorola, Inc. 
— Macintosh Technology in the Common Hardware Reference Platform, by Apple 
Computer, Inc. 

— Computer Architecture: A Quantitative Approach, Second Edition, by 
John L. Hennessy and David A. Patterson 

• Inside Macintosh: PowerPC System Software, Addison-Wesley Publishing 
Company, One Jacob Way, Reading, MA, 01867; Tel. (800) 282-2732 (U.S.A.), 
(800) 637-0029 (Canada), (716) 871-6555 (International). 

• PowerPC Programming for Intel Programmers, by Kip McClanahan; IDG Books 
Worldwide, Inc., 919 East Hillsdale Boulevard, Suite 400, Foster City, CA, 94404; 
Tel. (800) 434-3422 (U.S.A.), (415) 655-3022 (International). 

PowerPC Documentation 

The PowerPC documentation is organized in the following types of documents: 

• User’s manuals — These books provide details about individual PowerPC 
implementations and are intended to be used in conjunction with The Programming 
Environments Manual. These include the following: 

— PowerPC 601™ RISC Microprocessor User’s Manual: (IBM order 
#52G7484/(MPR601UMU-02) 

— PowerPC 602™ RISC Microprocessor User’s Manual: (IBM order 
#MPR602UM-01) 

— PowerPC 603e™ RISC Microprocessor User’s Manual with Supplement for 
PowerPC 603 Microprocessor: (IBM order #MPR603EUM-01) 

— PowerPC 604™ RISC Microprocessor User’s Manual: 
(IBM order #MPR604UMU-01) 

• The PowerPC Microprocessor Family: The Programming Environments 
provides information about resources defined by the PowerPC architecture that are 
common to PowerPC processors. This document describes both the 32- and 64-bit 
portions of the architecture. 

• Implementation Variances Relative to Rev. 1 of The Programming Environments 
Manual is available via the world-wide web at 
http://www.chips.ibm.com/products/ppc. 

• Addenda/errata to user’s manuals — Because some processors have follow-on parts, 
an addendum is provided that describes the additional features and changes to 
functionality of the follow-on part. These addenda are intended for use with the 
corresponding user’s manuals. These include the following: 

— Addendum to PowerPC 603e RISC Microprocessor User’s Manual: PowerPC 
603e Microprocessor Supplement and User’s Manual Errata: (IBM order # 
SA14-2034-00) 

— Addendum to PowerPC 604 RISC Microprocessor User’s Manual: PowerPC 
604e™ Microprocessor Supplement and User’s Manual Errata: (IBM order # 
SA14-2056-01) 
• Hardware specifications — Hardware specifications provide specific data regarding 
bus timing, signal behavior, and AC, DC, and thermal characteristics, as well as 
other design considerations for each PowerPC implementation. These include the 
following: 

— PowerPC 601 RISC Microprocessor Hardware Specifications: 
(IBM order # MPR601HSU-03) 

— PowerPC 602 RISC Microprocessor Hardware Specifications: 
(IBM order # SC229897-00) 

— PowerPC 603 RISC Microprocessor Hardware Specifications: 
(IBM order # MPR603HSU-03) 

— PowerPC 603e RISC Microprocessor Family: PID6-603e Hardware 
Specifications: (IBM order # G522-0268-00) 

— PowerPC 603e RISC Microprocessor Family: PID7V-603e Hardware 
Specifications: (IBM order # G522-0267-00) 

— PowerPC 604 RISC Microprocessor Hardware Specifications: 
(IBM order # MPR604HSU-02) 

— PowerPC 604e RISC Microprocessor Family: PID9V-604e Hardware 
Specifications: (IBM order # SA14-2054-00) 

• Technical Summaries — Each PowerPC implementation has a technical summary 
that provides an overview of its features. This document is roughly the equivalent to 
the overview (Chapter 1) of an implementation user’s manual. Technical summaries 
are available for the 601, 602, 603, 603e, 604, and 604e as well as the following: 

— PowerPC 620™ RISC Microprocessor Technical Summary: 
(IBM order # SA14-2069-01) 

• PowerPC Microprocessor Family: The Bus Interface for 32-Bit Microprocessors: 
(IBM order # G522-0291-00) provides a detailed functional description of the 60x 
bus interface, as implemented on the 601, 603, and 604 family of PowerPC 
microprocessors. This document is intended to help system and chipset developers 
by providing a centralized reference source to identify the bus interface presented by 
the 60x family of PowerPC microprocessors. 

• PowerPC Microprocessor Family: The Programmer’s Reference Guide: (IBM order 
# MPRPPCPRG-01) is a concise reference that includes the register summary, 
memory control model, exception vectors, and the PowerPC instruction set. 

• PowerPC Microprocessor Family: The Programmer’s Pocket Reference Guide: 
(IBM order # SA14-2093-00): This foldout card provides an overview of the 
PowerPC registers, instructions, and exceptions for 32-bit implementations. 

• Application notes — These short documents contain useful information about 
specific design issues useful to programmers and engineers working with PowerPC 
processors. 
• Documentation for support chips — These include the following: 

— MPC105 PCI Bridge/Memory Controller User’s Manual: 

MPC105UM/AD (Motorola order #) 

— MPC106 PCI Bridge/Memory Controller User’s Manual: 

MPC106UM/AD (Motorola order #) 

Additional literature on PowerPC implementations is being released as new processors 
become available. For a current list of PowerPC documentation, refer to the world-wide 
web at http://www.chips.ibm.com/products/ppc or at http://www.mot.com/powerpc/. 


Conventions 


This document uses the following notational conventions: 

mnemonics        Instruction mnemonics are shown in lowercase bold. 

italics          Italics indicate variable command parameters, for example, bcctrx. 
                 Book titles in text are set in italics. 

0x0              Prefix to denote hexadecimal number 

0b0              Prefix to denote binary number 

rA, rB           Instruction syntax used to identify a source GPR 

rD               Instruction syntax used to identify a destination GPR 

frA, frB, frC    Instruction syntax used to identify a source FPR 

frD              Instruction syntax used to identify a destination FPR 

REG[FIELD]       Abbreviations or acronyms for registers are shown in uppercase text. 
                 Specific bits, fields, or ranges appear in brackets. For example, 
                 MSR[LE] refers to the little-endian mode enable bit in the machine 
                 state register. 

x                In certain contexts, such as a signal encoding, this indicates a don’t 
                 care. 

n                Used to express an undefined numerical value 

¬                NOT logical operator 

&                AND logical operator 

|                OR logical operator 

U (margin icon)  This symbol identifies text that is relevant with respect to the 
                 PowerPC user instruction set architecture (UISA). This symbol is 
                 used both for information that can be found in the UISA specification 
                 as well as for explanatory information related to that programming 
                 environment. 

V (margin icon)  This symbol identifies text that is relevant with respect to the 
                 PowerPC virtual environment architecture (VEA). This symbol is 
                 used both for information that can be found in the VEA specification 
                 as well as for explanatory information related to that programming 
                 environment. 

O (margin icon)  This symbol identifies text that is relevant with respect to the 
                 PowerPC operating environment architecture (OEA). This symbol is 
                 used both for information that can be found in the OEA specification 
                 as well as for explanatory information related to that programming 
                 environment. 

0 0 0 0          Indicates reserved bits or bit fields in a register. Although these bits 
                 may be written to as either ones or zeros, they are always read as 
                 zeros. 


Additional conventions used with instruction encodings are described in Table 8-2 on page 
8-2. Conventions used for pseudocode examples are described in Table 8-3 on page 8-4. 
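
The REG[FIELD] notation, the 0x and 0b prefixes, and the reserved-bit rule can be illustrated with a short sketch. It assumes the PowerPC convention that bit 0 is the most-significant bit of a register and that MSR[LE] occupies bit 31 of the 32-bit MSR; both come from the architecture definition rather than from this section. The helper names and the reserved mask below are hypothetical and serve only as illustration.

```python
REG_WIDTH = 32  # register width on 32-bit implementations


def get_field(value, first, last, width=REG_WIDTH):
    """Return bits first..last (inclusive) of value, numbering bit 0 as the msb."""
    if not 0 <= first <= last < width:
        raise ValueError("bit range out of bounds")
    length = last - first + 1
    return (value >> (width - 1 - last)) & ((1 << length) - 1)


def read_reserved_as_zero(written, reserved_mask):
    """Model a register whose reserved bits accept any write but read as zeros."""
    return written & ~reserved_mask & ((1 << REG_WIDTH) - 1)


MSR_LE = 31  # assumed bit position of MSR[LE] (bit 0 = msb)

# Python happens to use the same 0x (hexadecimal) and 0b (binary) prefixes.
assert 0x10 == 0b10000 == 16

msr = 0x00000001                     # only the least-significant bit is set
le = get_field(msr, MSR_LE, MSR_LE)  # value of MSR[LE]

# Hypothetical register in which bits 16-23 are reserved (mask 0x0000FF00).
masked = read_reserved_as_zero(0xFFFFFFFF, 0x0000FF00)
print(le, hex(masked))
```

Because bit 0 is the most-significant bit, a field's shift count is computed from the register width minus the index of the field's last bit, which is the opposite of the bit numbering many programmers expect.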


Acronyms and Abbreviations 

Table i contains acronyms and abbreviations that are used in this document. Note that the 
meanings for some acronyms (such as SDR1 and XER) are historical, and the words for 
which an acronym stands may not be intuitively obvious. 


Table i. Acronyms and Abbreviated Terms 

Term      Meaning 
ALU       Arithmetic logic unit 
BAT       Block address translation 
BIST      Built-in self test 
BPU       Branch processing unit 
BUID      Bus unit ID 
CR        Condition register 
CTR       Count register 
DABR      Data address breakpoint register 
DAR       Data address register 
DBAT      Data BAT 
DEC       Decrementer register 
DSISR     Register used for determining the source of a DSI exception 
DTLB      Data translation lookaside buffer 
EA        Effective address 
EAR       External access register 
ECC       Error checking and correction 
FPECR     Floating-point exception cause register 
FPR       Floating-point register 
FPSCR     Floating-point status and control register 
FPU       Floating-point unit 
GPR       General-purpose register 
IBAT      Instruction BAT 
IEEE      Institute of Electrical and Electronics Engineers 
ITLB      Instruction translation lookaside buffer 
IU        Integer unit 
L2        Secondary cache 
LIFO      Last-in-first-out 
LR        Link register 
LRU       Least recently used 
LSB       Least-significant byte 
lsb       Least-significant bit 
MESI      Modified/exclusive/shared/invalid — cache coherency protocol 
MMU       Memory management unit 
MSB       Most-significant byte 
msb       Most-significant bit 
MSR       Machine state register 
NaN       Not a number 
NIA       Next instruction address 
No-op     No operation 
OEA       Operating environment architecture 
PIR       Processor identification register 
PTE       Page table entry 
PTEG      Page table entry group 
PVR       Processor version register 
RISC      Reduced instruction set computing 
RTL       Register transfer language 
RWITM     Read with intent to modify 
SDR1      Register that specifies the page table base address for virtual-to-physical address translation 
SIMM      Signed immediate value 
SLB       Segment lookaside buffer 
SPR       Special-purpose register 
SPRGn     Registers available for general purposes 
SR        Segment register 
SRR0      Machine status save/restore register 0 
SRR1      Machine status save/restore register 1 
STE       Segment table entry 
TB        Time base register 
TLB       Translation lookaside buffer 
UIMM      Unsigned immediate value 
UISA      User instruction set architecture 
VA        Virtual address 
VEA       Virtual environment architecture 
XATC      Extended address transfer code 
XER       Register used primarily for indicating conditions such as carries and overflows for integer operations 
Terminology Conventions 

Table ii lists certain terms used in this manual that differ from the architecture terminology 
conventions. 


Table ii. Terminology Conventions 

The Architecture Specification          This Manual 
Data storage interrupt (DSI)            DSI exception 
Extended mnemonics                      Simplified mnemonics 
Instruction storage interrupt (ISI)     ISI exception 
Interrupt                               Exception 
Privileged mode (or privileged state)   Supervisor-level privilege 
Problem mode (or problem state)         User-level privilege 
Real address                            Physical address 
Relocation                              Translation 
Storage (locations)                     Memory 
Storage (the act of)                    Access 


Table iii describes instruction field notation conventions used in this manual. 

Table iii. Instruction Field Conventions 
Chapter 1. Overview 

The PowerPC™ architecture provides a software model that ensures software compatibility 
among implementations of the PowerPC family of microprocessors. In this document, and 
in other PowerPC documentation as well, the term ‘implementation’ refers to a hardware 
device (typically a microprocessor) that complies with the specifications defined by the 
architecture. 

The PowerPC architecture was originally defined as a 32-bit architecture and was later 
extended to 64 bits. The terms 32-bit and 64-bit refer to the width of the integer registers 
and their supporting registers. In both versions the floating-point registers are 
64 bits wide. This book describes the 32-bit option only and is a subset of the document 
“PowerPC Microprocessor Family: The Programming Environments.” 

In general, the architecture defines the following: 

• Instruction set — The instruction set specifies the families of instructions (such as 
load/store, integer arithmetic, and floating-point arithmetic instructions), the 
specific instructions, and the forms used for encoding the instructions. The 
instruction set definition also specifies the addressing modes used for accessing 
memory. 

• Programming model — The programming model defines the register set and the 
memory conventions, including details regarding the bit and byte ordering, and the 
conventions for how data (such as integer and floating-point values) are stored. 

• Memory model — The memory model defines the size of the address space and of the 
subdivisions (pages and blocks) of that address space. It also defines the ability to 
configure pages and blocks of memory with respect to caching, byte ordering (big- 
or little-endian), coherency, and various types of memory protection. 

• Exception model — The exception model defines the common set of exceptions and 
the conditions that can generate those exceptions. The exception model specifies 
characteristics of the exceptions, such as whether they are precise or imprecise, 
synchronous or asynchronous, and maskable or nonmaskable. The exception model 
defines the exception vectors and a set of registers used when exceptions are taken. 
The exception model also provides memory space for implementation-specific 
exceptions. (Exceptions are referred to as interrupts in the architecture specification.) 
• Memory management model — The memory management model defines how 
memory is partitioned, configured, and protected. The memory management model 
also specifies how memory translation is performed, the real, virtual, and physical 
address spaces, special memory control instructions, and other characteristics. 
(Physical address is referred to as real address in the architecture specification.) 

• Time-keeping model — The time-keeping model defines facilities that permit the 
time of day to be determined and the resources and mechanisms required for 
supporting time-related exceptions. 
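
The byte-ordering choice mentioned under the programming and memory models in the list above can be shown with a small sketch. It uses Python's standard struct module to lay out one 32-bit value in both orders; the sample value is arbitrary, and the sketch illustrates only the general big- and little-endian convention, not how any particular PowerPC processor implements its little-endian mode.

```python
import struct

value = 0x0A0B0C0D  # arbitrary 32-bit sample value

# Big-endian: the most-significant byte is stored at the lowest address.
big = struct.pack(">I", value)

# Little-endian: the least-significant byte is stored at the lowest address.
little = struct.pack("<I", value)

for name, image in (("big-endian", big), ("little-endian", little)):
    # Show each byte next to its offset from the start of the word.
    layout = ", ".join(f"+{offset}: 0x{byte:02X}" for offset, byte in enumerate(image))
    print(f"{name:13s} {layout}")
```

Reading the bytes back with the opposite format character yields a different integer, which is why software that shares data across the two modes has to agree on the byte order.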

These aspects of the PowerPC architecture are defined at different levels of the architecture, 
and this chapter provides an overview of those levels — the user instruction set architecture 
(UISA), the virtual environment architecture (VEA), and the operating environment 
architecture (OEA). 

To locate any published errata or updates for this document, refer to the website at 
http://www.mot.com/powerpc/ or at http://www.chips.ibm.com/products/ppc. 

1.1 PowerPC Architecture Overview 

The PowerPC architecture, developed jointly by Motorola, IBM, and Apple Computer, is 
based on the POWER architecture implemented by the RS/6000™ family of computers. The 
PowerPC architecture takes advantage of recent technological advances in such areas as 
process technology, compiler design, and reduced instruction set computing (RISC) 
microprocessor design to provide software compatibility across a diverse family of 
implementations, primarily single-chip microprocessors, intended for a wide range of 
systems, including battery-powered personal computers; embedded controllers; high-end 
scientific and graphics workstations; and multiprocessing, microprocessor-based 
mainframes. 

To provide a single architecture for such a broad assortment of processor environments, the 
PowerPC architecture is both flexible and scalable. 

The flexibility of the PowerPC architecture offers many price/performance options. 
Designers can choose whether to implement architecturally-defined features in hardware or 
in software. For example, a processor designed for a high-end workstation has greater need 
for the performance gained from implementing floating-point normalization and 
denormalization in hardware than a battery-powered, general-purpose computer might. 

The PowerPC architecture is scalable to take advantage of continuing technological 
advances — for example, the continued miniaturization of transistors makes it more feasible 
to implement more execution units and a richer set of optimizing features without being 
constrained by the architecture. 





The PowerPC architecture defines the following features: 

• Separate 32-entry register files for integer and floating-point instructions. The 
general-purpose registers (GPRs) hold source data for integer arithmetic 
instructions, and the floating-point registers (FPRs) hold source and target data for 
floating-point arithmetic instructions. 

• Instructions for loading and storing data between the memory system and either the 
FPRs or GPRs 

• Uniform-length instructions to allow simplified instruction pipelining and parallel 
processing instruction dispatch mechanisms 

• Nondestructive use of registers for arithmetic instructions, in which the second, third, 
and sometimes fourth operands typically specify source registers for calculations 
whose results are typically stored in the target register specified by the first operand 

• A precise exception model (with the option of treating floating-point exceptions 
imprecisely) 

• Floating-point support that includes IEEE-754 floating-point operations 

• A flexible architecture definition that allows certain features to be performed 
either in hardware or with assistance from implementation-specific software, 
depending on the needs of the processor design 

• The ability to perform both single- and double-precision floating-point operations 

• User-level instructions for explicitly storing, flushing, and invalidating data in the 
on-chip caches. The architecture also defines special instructions (cache block touch 
instructions) for speculatively loading data before it is needed, reducing the effect of 
memory latency. 

• Definition of a memory model that allows weakly-ordered memory accesses. This 
allows bus operations to be reordered dynamically, which improves overall 
performance and in particular reduces the effect of memory latency on instruction 
throughput. 

• Support for separate instruction and data caches (Harvard architecture) and for 
unified caches 

• Support for both big- and little-endian addressing modes 

• Support for both 32-bit and 64-bit implementations. This document 
typically describes the architecture in terms of the 32-bit implementations. 
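
Two of the features listed above, uniform-length instructions and nondestructive three-operand arithmetic, can be seen by splitting one instruction word into its fields. The sketch below is illustrative only: the function name is hypothetical, and the field layout and the encoding of add (primary opcode 31, extended opcode 266) are taken from the instruction set definition rather than from this chapter.

```python
def decode_xo_form(word):
    """Split a 32-bit XO-form instruction word into fields (bit 0 = msb)."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction words are exactly 32 bits wide")
    return {
        "opcd": (word >> 26) & 0x3F,  # bits 0-5: primary opcode
        "rD": (word >> 21) & 0x1F,    # bits 6-10: destination GPR
        "rA": (word >> 16) & 0x1F,    # bits 11-15: first source GPR
        "rB": (word >> 11) & 0x1F,    # bits 16-20: second source GPR
        "OE": (word >> 10) & 0x1,     # bit 21: overflow-enable bit
        "xo": (word >> 1) & 0x1FF,    # bits 22-30: extended opcode
        "Rc": word & 0x1,             # bit 31: record bit
    }


# 0x7C642A14 is assumed here to be the encoding of: add r3,r4,r5
fields = decode_xo_form(0x7C642A14)
print(fields)
```

The destination register (rD) is distinct from both source registers, so neither source operand is overwritten unless the programmer names the same register twice.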

This chapter provides an overview of the major characteristics of the PowerPC architecture 
in the order in which they are addressed in this book: 

• Register set and programming model 

• Instruction set and addressing modes 

• Cache implementations 

• Exception model 

• Memory management 
1.1.1 The 64-Bit PowerPC Architecture and the 32-Bit Subset 

The PowerPC architecture is a 64-bit architecture with a 32-bit subset. It is important to 
distinguish the following modes of operations: 

• 64-bit implementations/64-bit mode — The PowerPC architecture provides 64-bit 
addressing, 64-bit integer data types, and instructions that perform arithmetic 
operations on those data types, as well as other features to support the wider 
addressing range. For example, memory management differs somewhat between 32- 
and 64-bit processors. The processor is configured to operate in 64-bit mode by 
setting a bit in the machine state register (MSR). 

• Processors that implement only the 32-bit portion of the PowerPC architecture 
provide 32-bit effective addresses, which is also the maximum size of integer data 
types. 

• 64-bit implementations/32-bit mode — For compatibility with 32-bit 
implementations, 64-bit implementations can be configured to operate in 32-bit 
mode by clearing the MSR[SF] bit. In 32-bit mode, the effective address is treated 
as a 32-bit address; condition bits, such as overflow and carry bits, are set based on 
32-bit arithmetic (for example, integer overflow occurs when the result exceeds 
32 bits); and the count register (CTR) is tested by branch conditional instructions 
following conventions for 32-bit implementations. All applications written for 
32-bit implementations will run without modification on 64-bit processors running in 
32-bit mode. 
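
The 32-bit behavior described above (a 32-bit effective address, with carry and overflow determined by 32-bit arithmetic) can be modeled in a few lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, truncation by masking stands in for "treated as a 32-bit address", and the carry and overflow results correspond only loosely to the carry and overflow bits the text mentions.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def effective_address32(base, displacement):
    """Form an effective address and keep only its low-order 32 bits."""
    return (base + displacement) & MASK32


def add32(a, b):
    """Add two 32-bit values; return (result, carry_out, signed_overflow)."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    carry = full >> 32  # 1 when the unsigned sum does not fit in 32 bits
    # Signed overflow: both operands share a sign that the result does not.
    overflow = int((~(a ^ b) & (a ^ result) & SIGN32) != 0)
    return result, carry, overflow


print(add32(0x7FFFFFFF, 1))  # largest positive value plus one
print(add32(0xFFFFFFFF, 1))  # -1 plus one, viewed as signed values
print(hex(effective_address32(0xFFFFFFFC, 8)))
```

Carry and overflow are independent: the first call overflows as a signed operation without producing a carry, and the second produces a carry without overflowing.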


1.1.2 The Levels of the PowerPC Architecture 

The PowerPC architecture is defined in three levels that correspond to three programming 
environments, roughly described from the most general, user-level instruction set 
environment, to the more specific, operating environment. 

This layering of the architecture provides flexibility, allowing degrees of software 
compatibility across a wide range of implementations. For example, an implementation 
such as an embedded controller may support the user instruction set, whereas it may be 
impractical for it to adhere to the memory management, exception, and cache models. 

The three levels of the PowerPC architecture are defined as follows: 


• PowerPC user instruction set architecture (UISA) — The UISA defines the level of 
the architecture to which user-level (referred to as problem state in the architecture 
specification) software should conform. The UISA defines the base user-level 
instruction set, user-level registers, data types, floating-point memory conventions 
and exception model as seen by user programs, and the memory and programming 
models. The icon shown in the margin identifies text that is relevant with respect to 
the UISA. 



• PowerPC virtual environment architecture (VEA) — The VEA defines additional 
user-level functionality that falls outside typical user-level software requirements. 
The VEA describes the memory model for an environment in which multiple 
devices can access memory, defines aspects of the cache model, defines cache 
control instructions, and defines the time base facility from a user-level perspective. 
The icon shown in the margin identifies text that is relevant with respect to the VEA. 



Implementations that conform to the PowerPC VEA also adhere to the UISA, but 
may not necessarily adhere to the OEA. 


• PowerPC operating environment architecture (OEA) — The OEA defines supervisor- 
level (referred to as privileged state in the architecture specification) resources 
typically required by an operating system. The OEA defines the PowerPC memory 
management model, supervisor-level registers, synchronization requirements, and 
the exception model. The OEA also defines the time base feature from a supervisor- 
level perspective. The icon shown in the margin identifies text that is relevant with 
respect to the OEA. 


Implementations that conform to the PowerPC OEA also conform to the PowerPC 
UISA and VEA. 


Implementations that adhere to the VEA level are guaranteed to adhere to the UISA level; 
likewise, implementations that conform to the OEA level are also guaranteed to conform to 
the UISA and the VEA levels. 


All PowerPC devices adhere to the UISA, offering compatibility among all PowerPC 
application programs. However, there may be different versions of the VEA and OEA than 
those described here. For example, some devices, such as embedded controllers, may not 
require some of the features as defined by this VEA and OEA, and may implement a 
simpler or modified version of those features. 

The general-purpose PowerPC microprocessors developed jointly by Motorola and IBM 
(such as the PowerPC 601™, PowerPC 603™, PowerPC 603e™, PowerPC 604™, 
PowerPC 604e™, and PowerPC 620™ microprocessors) comply both with the UISA and 
with the VEA and OEA discussed here. In this book, these three levels of the architecture 
are referred to collectively as the PowerPC architecture. 

The distinctions between the levels of the PowerPC architecture are maintained clearly 
throughout this document, using the conventions described in the section “Conventions” on 
page xxxiii of the Preface. 

1.1.3 Latitude Within the Levels of the PowerPC Architecture 

The PowerPC architecture defines those parameters necessary to ensure compatibility 
among PowerPC processors, but also allows a wide range of options for individual 
implementations. These are as follows: 

• The PowerPC architecture defines some facilities (such as registers, bits within 
registers, instructions, and exceptions) as optional. 

• The PowerPC architecture allows implementations to define additional privileged 
special-purpose registers (SPRs), exceptions, and instructions for special system 
requirements (such as power management in processors designed for very low- 
power operation). 

• There are many other parameters that the PowerPC architecture allows 
implementations to define. For example, the PowerPC architecture may define 
conditions for which an exception may be taken, such as alignment conditions. A 
particular implementation may choose to solve the alignment problem without 
taking the exception. 

• Processors may implement any architectural facility or instruction with assistance 
from software (that is, they may trap and emulate) as long as the results (aside from 
performance) are identical to those specified by the architecture. 

• Some parameters are defined at one level of the architecture and defined more 
specifically at another. For example, the UISA defines conditions that may cause an 
alignment exception, and the OEA specifies the exception itself. 

Because of updates to the PowerPC architecture specification, which are described in this 
document, variances may result between existing devices and the revised architecture 
specification. Those variances are included in Implementation Variances Relative to Rev. 1 
of The Programming Environments Manual. 

1.1.4 Features Not Defined by the PowerPC Architecture 

Because flexibility is an important design goal of the PowerPC architecture, there are many 
aspects of the processor design, typically relating to the hardware implementation, that the 
PowerPC architecture does not define, such as the following: 

• System bus interface signals — Although numerous implementations may have 
similar interfaces, the PowerPC architecture does not define individual signals or the 
bus protocol. For example, the OEA allows each implementation to determine the 
signal or signals that trigger the machine check exception. 

• Cache design — The PowerPC architecture does not define the size, structure, or 
replacement algorithm of the cache, or the mechanism used for maintaining cache 
coherency. The PowerPC architecture supports, but does not require, the use of 
separate instruction and data caches. 

• The number and the nature of execution units — The PowerPC architecture is a RISC 
architecture, and as such has been designed to facilitate the design of processors that 
use pipelining and parallel execution units to maximize instruction throughput. 
However, the PowerPC architecture does not define the internal hardware details of 
implementations. For example, one processor may execute load and store operations 
in the integer unit, while another may execute these instructions in a dedicated 
load/store unit. 

• Other internal microarchitecture issues — The PowerPC architecture does not 
prescribe which execution unit is responsible for executing a particular instruction; 
it also does not define details regarding the instruction fetching mechanism, how 
instructions are decoded and dispatched, and how results are written back. Dispatch 
and write-back may occur in order or out of order. Also, while the architecture 
specifies certain registers, such as the GPRs and FPRs, implementations can 
implement register renaming or other schemes to reduce the impact of data 
dependencies and register contention. 

1.2 The PowerPC Architectural Models 

This section provides overviews of aspects defined by the PowerPC architecture, following 
the same order as the rest of this book. The topics include the following: 


• PowerPC registers and programming model 

• PowerPC operand conventions 

• PowerPC instruction set and addressing modes 

• PowerPC cache model 

• PowerPC exception model 

• PowerPC memory management model 

1.2.1 PowerPC Registers and Programming Model 

The PowerPC architecture defines register-to-register operations for computational 
instructions. Source operands for these instructions are accessed from the architected 
registers or are provided as immediate values embedded in the instruction. The three- 
register instruction format allows specification of a target register distinct from two source 
operand registers. This scheme allows efficient code scheduling in a highly parallel 
processor. Load and store instructions are the only instructions that transfer data between 
registers and memory. The PowerPC registers are shown in Figure 1-1. 


USER MODEL — UISA 

    32 General-Purpose Registers (GPRs) 
    32 Floating-Point Registers (FPRs) 
    Condition Register (CR) 
    Floating-Point Status and Control Register (FPSCR) 
    XER 
    Link Register (LR) 
    Count Register (CTR) 

USER MODEL — VEA 

    Time Base Facility (TBU and TBL) (for reading) 

SUPERVISOR MODEL — OEA 

    Configuration Registers 
        Machine State Register (MSR) 
        Processor Version Register (PVR) 

    Memory Management Registers 
        8 Instruction BAT Registers (IBATs) 
        8 Data BAT Registers (DBATs) 
        SDR1 
        16 Segment Registers (SRs) 

    Exception Handling Registers 
        Data Address Register (DAR) 
        DSISR 
        Save and Restore Registers (SRR0/SRR1) 
        SPRG0-SPRG3 
        Floating-Point Exception Cause Register (FPECR) 1 

    Miscellaneous Registers 
        Time Base Facility (TBU and TBL) (for writing) 
        Decrementer Register (DEC) 
        Data Address Breakpoint Register (DABR) 1 
        Processor Identification Register (PIR) 1 
        External Access Register (EAR) 1 

1 Optional 


Figure 1-1. Programming Model — PowerPC Registers 

The programming model incorporates 32 GPRs, 32 FPRs, special-purpose registers 
(SPRs), and several miscellaneous registers. Each implementation may have its own unique 
set of hardware implementation-dependent (HID) registers that are not defined by the 
architecture. 

PowerPC processors have two levels of privilege: 

• Supervisor mode — used exclusively by the operating system. Resources defined by 
the OEA can be accessed only by supervisor-level software. 

• User mode — used by the application software and operating system software (Only 
resources defined by the UISA and VEA can be accessed by user-level software.) 

These two levels govern the access to registers, as shown in Figure 1-1. The division of 
privilege allows the operating system to control the application environment (providing 
virtual memory and protecting operating system and critical machine resources). 


Instructions that control the state of the processor, the address translation mechanism, and 
supervisor registers can be executed only when the processor is operating in supervisor 
mode. 


• User Instruction Set Architecture Registers — All UISA registers can be accessed 
by all software with either user or supervisor privileges. These registers include the 
32 general-purpose registers (GPRs), the 32 floating-point registers (FPRs), and 
other registers used for integer, floating-point, and branch instructions. 

• Virtual Environment Architecture Registers — The VEA defines the user-level 
portion of the time base facility, which consists of the two 32-bit time base registers. 
These registers can be read by user-level software, but can be written to only by 
supervisor-level software. 

• Operating Environment Architecture Registers — SPRs defined by the OEA are 
used for system-level operations such as memory management, exception handling, 
and time-keeping. 





The PowerPC architecture also provides room in the SPR space for implementation- 
specific registers, typically referred to as HID registers. Individual HIDs are not discussed 
in this manual. 
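
The access rules described above can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the names `Level` and `can_access` are hypothetical, every register is classified only by the architecture level that defines it, and the VEA case models the time base registers (readable at user level, writable only at supervisor level).

```python
from enum import Enum

class Level(Enum):
    UISA = "UISA"
    VEA = "VEA"
    OEA = "OEA"

def can_access(level, supervisor, write=False):
    """Return True if software at the given privilege may access a register."""
    if supervisor:
        return True             # supervisor-level software sees all three levels
    if level is Level.UISA:
        return True             # user-level registers: read and write
    if level is Level.VEA:
        return not write        # time base: user-level software may only read
    return False                # OEA resources are supervisor-only
```

For example, `can_access(Level.VEA, supervisor=False, write=True)` is false in this model, reflecting that only supervisor-level software may write the time base.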


1.2.2 Operand Conventions 

Operand conventions are defined in two levels of the PowerPC architecture — user 
instruction set architecture (UISA) and virtual environment architecture (VEA). These 
conventions define how data is stored in registers and memory. 


1.2.2.1 Byte Ordering 

The default mapping for PowerPC processors is big-endian, but the UISA provides the 
option of operating in either big- or little-endian mode. Big-endian byte ordering is shown 
in Figure 1-2. 



MSB 
| Byte 0 | Byte 1 | ... | Byte N (max) | 

Big-Endian Byte Ordering 


Figure 1-2. Big-Endian Byte and Bit Ordering 

The OEA defines two bits in the MSR for specifying byte ordering — LE (little-endian 
mode) and ILE (exception little-endian mode). The LE bit specifies whether the processor 
is configured for big-endian or little-endian mode; the ILE bit specifies the mode used 
when an exception is taken, at which point it is copied into the LE bit of the MSR. For 
both bits, a value of 0 specifies big-endian mode and a value of 1 specifies little-endian 
mode. 
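
The difference between the two byte orderings can be seen in a small Python sketch that lays out a 32-bit word in memory, lowest address first. The helper `store_word` is a hypothetical name, and the sketch shows only the resulting byte order for an aligned word; it does not model how an implementation produces it.

```python
def store_word(value, little_endian=False):
    """Return the four memory bytes of a 32-bit word, lowest address first."""
    order = "little" if little_endian else "big"
    return (value & 0xFFFFFFFF).to_bytes(4, order)

big = store_word(0x0A0B0C0D)             # as with MSR[LE] = 0
little = store_word(0x0A0B0C0D, True)    # as with MSR[LE] = 1
```

In big-endian order the most-significant byte (0x0A) occupies the lowest address; in little-endian order the least-significant byte (0x0D) does.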

1.2.2.2 Data Organization in Memory and Data Transfers 

Bytes in memory are numbered consecutively starting with 0. Each number is the address 
of the corresponding byte. 




Memory operands may be bytes, half words, words, or double words, or, for the load/store 
string/multiple instructions, a sequence of bytes or words. The address of a multiple-byte 
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the natural address of an operand is 
an integral multiple of the operand length. A memory operand is said to be aligned if it is 
aligned at its natural boundary; otherwise it is misaligned. 
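
The natural alignment rule reduces to a single divisibility test, sketched below in Python. The function name `is_aligned` and the restriction to 1-, 2-, 4-, and 8-byte operands (byte, half word, word, double word) are choices made for this example.

```python
def is_aligned(address, length):
    """True if an operand of `length` bytes at `address` is naturally aligned."""
    if length not in (1, 2, 4, 8):
        raise ValueError("operand length must be 1, 2, 4, or 8 bytes")
    return address % length == 0
```

A word at address 0x1002 is therefore misaligned, whereas a half word at the same address is aligned.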

1.2.2.3 Floating-Point Conventions 

The PowerPC architecture adheres to the IEEE-754 standard for 64- and 32-bit floating- 
point arithmetic: 

• Double-precision arithmetic instructions may have single- or double-precision 
operands but always produce double-precision results. 

• Single-precision arithmetic instructions require all operands to be single-precision 
values and always produce single-precision results. Single-precision values are 
stored in double-precision format in the FPRs — these values are rounded such that 
they can be represented in 32-bit, single-precision format (as they are in memory). 
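
The relationship between the two formats can be illustrated in Python, whose `float` type is an IEEE-754 double. The sketch below rounds a double to the nearest value representable in single precision while still holding it in double format, which is analogous to how a single-precision result resides in an FPR. The helper name `round_to_single` is an assumption for this example, and the rounding mode is whatever `struct` applies rather than one selected through the FPSCR.

```python
import struct

def round_to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]
```

A value such as 0.5 is unchanged by this conversion, whereas 0.1 is not exactly representable in either format and rounds to a different double when forced through single precision.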

1.2.3 PowerPC Instruction Set and Addressing Modes 

All PowerPC instructions are encoded as single-word (32-bit) instructions. Instruction 
formats are consistent among all instruction types, permitting decoding to occur in parallel 
with operand accesses. This fixed instruction length and consistent format greatly 
simplifies instruction pipelining. 
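
To suggest why a fixed 32-bit format is easy to decode, the following Python sketch extracts the fields of one instruction format. It assumes the D-form layout described in the instruction set chapters (a 6-bit primary opcode, two 5-bit register fields, and a 16-bit immediate, with bit 0 as the most-significant bit); the function name `decode_d_form` is hypothetical.

```python
def decode_d_form(word):
    """Split a 32-bit D-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F    # bits 0-5: primary opcode
    rd = (word >> 21) & 0x1F        # bits 6-10: target register
    ra = (word >> 16) & 0x1F        # bits 11-15: source register
    simm = word & 0xFFFF            # bits 16-31: 16-bit immediate
    if simm & 0x8000:               # sign-extend the immediate
        simm -= 0x10000
    return opcode, rd, ra, simm

fields = decode_d_form(0x38640001)  # encoding of addi r3,r4,1
```

Because every field sits at a fixed position, the register numbers can be extracted without first knowing which instruction the opcode selects.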

1.2.3.1 PowerPC Instruction Set 

Although these categories are not defined by the PowerPC architecture, the PowerPC 
instructions can be grouped as follows: 

• Integer instructions — These instructions are defined by the UISA. They include 
computational and logical instructions. 

— Integer arithmetic instructions 
— Integer compare instructions 
— Logical instructions 
— Integer rotate and shift instructions 

• Floating-point instructions — These instructions, defined by the UISA, include 
floating-point computational instructions, as well as instructions that manipulate the 
floating-point status and control register (FPSCR). 

— Floating-point arithmetic instructions 
— Floating-point multiply-add instructions 
— Floating-point compare instructions 
— Floating-point status and control instructions 
— Floating-point move instructions 
— Optional floating-point instructions 

• Load/store instructions — These instructions, defined by the UISA, include integer 
and floating-point load and store instructions. 

— Integer load and store instructions 
— Integer load and store with byte reverse instructions 
— Integer load and store multiple instructions 
— Integer load and store string instructions 
— Floating-point load and store instructions 

The UISA also provides a set of load/store with reservation instructions (lwarx and 
stwcx.) that can be used as primitives for constructing atomic memory operations. 
These are grouped under synchronization instructions. 

• Synchronization instructions — The UISA and VEA define instructions for memory 
synchronization, which are especially useful for multiprocessing: 

— Load and store with reservation instructions — These UISA-defined instructions 
provide primitives for synchronization operations such as test and set, compare 
and swap, and compare memory. 

— The Synchronize instruction (sync) — This UISA-defined instruction is useful for 
synchronizing load and store operations on a memory bus that is shared by 
multiple devices. 

— Enforce In-Order Execution of I/O (eieio) — The eieio instruction provides an 
ordering function for the effects of load and store operations executed by a 
processor. 



• Flow control instructions — These include branching instructions, condition register 
logical instructions, trap instructions, and other instructions that affect the 
instruction flow. 

— The UISA defines numerous instructions that control the program flow, 
including branch, trap, and system call instructions as well as instructions that 
read, write, or manipulate bits in the condition register. 

— The OEA defines two flow control instructions that provide system linkage. 
These instructions are used for entering and returning from supervisor level. 

• Processor control instructions — These instructions are used for synchronizing 
memory accesses and managing caches, translation lookaside buffers (TLBs), and 
segment registers. These instructions include move to/from special-purpose 
register instructions (mtspr and mfspr). 

• Memory/cache control instructions — These instructions provide control of caches, 
TLBs, and segment registers. 

— The VEA defines several cache control instructions. 

— The OEA defines one cache control instruction and several memory control 
instructions. 

• External control instructions — The VEA defines two optional instructions for use 
with special input/output devices. 

NOTE: This grouping of the instructions does not indicate which execution unit executes 
a particular instruction or group of instructions. This is not defined by the 
PowerPC architecture. 
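
The way the load and store with reservation instructions serve as primitives for atomic operations can be sketched with a toy Python model. Everything here is an illustrative assumption: the class `ReservationMemory`, its single reservation, and the method `store_from_other` (standing in for a store by another processor or device) are hypothetical and greatly simplify the behavior of real hardware.

```python
class ReservationMemory:
    """Toy model of one processor's reservation on word memory."""

    def __init__(self):
        self.words = {}
        self.reserved = None        # address currently reserved, if any

    def lwarx(self, addr):
        self.reserved = addr        # load and establish a reservation
        return self.words.get(addr, 0)

    def store_from_other(self, addr, value):
        if self.reserved == addr:   # a competing store clears the reservation
            self.reserved = None
        self.words[addr] = value

    def stwcx(self, addr, value):
        """Store only if the reservation is still held; report success."""
        ok = self.reserved == addr
        if ok:
            self.words[addr] = value
        self.reserved = None        # the reservation is consumed either way
        return ok

def atomic_add(mem, addr, delta):
    """Retry loop in the style of a lwarx/stwcx. sequence."""
    while True:
        old = mem.lwarx(addr)
        if mem.stwcx(addr, (old + delta) & 0xFFFFFFFF):
            return old
```

The key design point is that the conditional store reports whether it succeeded, so software simply retries the load-modify-store sequence until no competing store has intervened.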



1.2.3.2 Calculating Effective Addresses 

The effective address (EA), also called the logical address, is the address computed by the 
processor when executing a memory access or branch instruction or when fetching the next 
sequential instruction. Unless address translation is disabled, this address is converted by 
the MMU to the appropriate physical address. 


NOTE: The architecture specification uses only the term effective address and not logical 
address. 


The PowerPC architecture supports the following simple addressing modes for memory 
access instructions: 

• EA = (rA|0) (register indirect) 

• EA = (rA|0) + offset (including offset = 0) (register indirect with immediate index) 

• EA = (rA|0) + rB (register indirect with index) 

These simple addressing modes allow efficient address generation for memory accesses. 
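
The addressing modes listed above can be sketched in Python as follows. The notation (rA|0) is taken here to mean the contents of GPR rA, or the value 0 when the rA field is 0. The function names, the `gprs` list standing in for the register file, and the 32-bit wraparound of the sum are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def base(gprs, ra):
    """(rA|0): the contents of GPR rA, or the value 0 when the rA field is 0."""
    return 0 if ra == 0 else gprs[ra]

def ea_indirect_immediate(gprs, ra, offset):
    """EA = (rA|0) + offset, with `offset` a signed immediate."""
    return (base(gprs, ra) + offset) & MASK32

def ea_indirect_indexed(gprs, ra, rb):
    """EA = (rA|0) + rB."""
    return (base(gprs, ra) + gprs[rb]) & MASK32
```

Register indirect addressing is the special case of `ea_indirect_immediate` with an offset of 0. Note that an rA field of 0 yields a base of 0 regardless of what GPR0 holds.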



1.2.4 PowerPC Cache Model 

The VEA and OEA portions of the architecture define aspects of cache implementations for 
PowerPC processors. The PowerPC architecture does not define hardware aspects of cache 
implementations. For example, some PowerPC processors may have separate instruction 
and data caches (Harvard architecture), while others have a unified cache. 


The PowerPC architecture allows implementations to control the following memory access 
modes on a page or block basis: 

• Write-back/write-through mode 

• Caching-inhibited mode 

• Memory coherency 

• Guarded/not guarded against speculative accesses 

Coherency is maintained on a cache block basis, and cache control instructions perform 
operations on a cache block basis. The size of the cache block is implementation-dependent. 
The term cache block should not be confused with the notion of a block in 
memory, which is described in Section 1.2.6, “PowerPC Memory Management Model.” 


The VEA portion of the PowerPC architecture defines several instructions for cache 
management. These can be used by user-level software to perform such operations as touch 
operations (which cause the cache block to be speculatively loaded), and operations to 
store, flush, or clear the contents of a cache block. The OEA portion of the architecture 
defines one cache management instruction — the Data Cache Block Invalidate (dcbi) 
instruction. 
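
Because cache control instructions operate on whole cache blocks, software often needs the address of the block that contains a given effective address. The Python sketch below shows that calculation. The 32-byte default is only an assumed example value, since the block size is implementation-dependent, and the function name `cache_block_base` is hypothetical.

```python
def cache_block_base(ea, block_size=32):
    """Address of the first byte of the cache block containing `ea`."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of two")
    return ea & ~(block_size - 1) & 0xFFFFFFFF
```

Portable software would obtain the block size for the processor at hand rather than hard-coding it.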




1.2.5 PowerPC Exception Model 

The PowerPC exception mechanism, defined by the OEA, allows the processor to change 
to supervisor state as a result of external signals, errors, or unusual conditions arising in the 
execution of instructions. When exceptions occur, information about the state of the 
processor is saved to various registers and the processor begins execution at an address 
(exception vector) predetermined for each type of exception. 

Exception handler routines begin execution in supervisor mode. The PowerPC exception 
model is described in detail in Chapter 6, “Exceptions.” 

NOTE: Some aspects of exception conditions are defined at other levels of the 
architecture. For example, floating-point exception conditions are defined by the 
UISA, whereas the exception mechanism is defined by the OEA. 

The PowerPC architecture requires that exceptions be handled in program order (excluding the 
optional floating-point imprecise modes and the reset and machine check exceptions); 
therefore, although a particular implementation may recognize exception conditions out of 
order, they are handled strictly in order. When an instruction-caused exception is 
recognized, any unexecuted instructions that appear earlier in the instruction stream, 
including any that have not yet begun to execute, are required to complete before the 
exception is taken. Any exceptions caused by those instructions must be handled first. 
Likewise, exceptions that are asynchronous and precise are recognized when they occur, 
but are not handled until all instructions currently executing successfully complete 
processing and report their results. 

The OEA supports four types of exceptions: 

• Synchronous, precise 

• Synchronous, imprecise 

• Asynchronous, maskable 

• Asynchronous, nonmaskable 

1.2.6 PowerPC Memory Management Model 

The PowerPC memory management unit (MMU) specifications are provided by the 
PowerPC OEA. The primary functions of the MMU in a PowerPC processor are to translate 
logical (effective) addresses to physical addresses for memory accesses and I/O accesses 
(most I/O accesses are assumed to be memory-mapped), and to provide access protection 
on a block or page basis. 



NOTE: Many aspects of memory management are implementation-dependent. The 
description in Chapter 7, “Memory Management,” describes the conceptual 
model of a PowerPC MMU; however, PowerPC processors may differ in the 
specific hardware used to implement the MMU model of the OEA. 


PowerPC processors require address translation for two types of transactions — instruction 
accesses and data accesses to memory (typically generated by load and store instructions). 


The entire 4-Gbyte memory space is defined by sixteen 256-Mbyte segments. 
Segments are configured through the 16 segment registers. In addition, the MMU of 
PowerPC processors uses an interim virtual address (52 bits) and hashed page tables in the 
generation of 32-bit physical addresses. 


PowerPC processors also have a block address translation (BAT) mechanism for mapping 
large blocks of memory. Block sizes range from 128 Kbyte to 256 Mbyte and are software- 
selectable. 
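
As a small worked example of the stated range, the Python sketch below enumerates candidate block sizes between the two limits. It assumes that the selectable sizes are the powers of two from 128 Kbytes to 256 Mbytes; the text above gives only the limits, so that progression and the function name `bat_block_sizes` should be read as assumptions.

```python
KBYTE = 1024
MBYTE = 1024 * KBYTE

def bat_block_sizes():
    """Block sizes from 128 Kbytes to 256 Mbytes, doubling at each step."""
    sizes = []
    size = 128 * KBYTE
    while size <= 256 * MBYTE:
        sizes.append(size)
        size *= 2
    return sizes
```

Under that assumption there are twelve sizes, since 128 Kbytes doubled eleven times gives 256 Mbytes.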


Two types of accesses generated by PowerPC processors require address translation: 
instruction accesses, and data accesses to memory generated by load and store instructions. 
The address translation mechanism is defined in terms of segment registers and page tables 
used by PowerPC processors to locate the logical-to-physical address mapping for 
instruction and data accesses. The segment information translates the logical (effective) 
address to an interim virtual address, and the page table information translates the virtual 
address to a physical (real) address. 
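
The first stage of this translation can be sketched in Python. The sketch assumes the field widths given in the memory management chapter: the upper 4 bits of the effective address select one of the 16 segment registers, the next 16 bits are a page index, the low 12 bits are a byte offset within a 4-Kbyte page, and each segment register supplies a 24-bit virtual segment ID (VSID). The function names and the `segment_vsids` list are hypothetical, and the page table lookup that yields the physical address is not modeled.

```python
def split_effective_address(ea):
    """Split a 32-bit EA into (segment register number, page index, byte offset)."""
    sr = (ea >> 28) & 0xF               # 16 segments of 256 Mbytes each
    page_index = (ea >> 12) & 0xFFFF    # 16-bit page index within the segment
    offset = ea & 0xFFF                 # 12-bit byte offset within the page
    return sr, page_index, offset

def virtual_address(ea, segment_vsids):
    """Form the 52-bit interim virtual address: VSID, page index, byte offset."""
    sr, page_index, offset = split_effective_address(ea)
    vsid = segment_vsids[sr] & 0xFFFFFF     # 24-bit virtual segment ID
    return (vsid << 28) | (page_index << 12) | offset
```

The three field widths (24 + 16 + 12) account for the 52 bits of the interim virtual address.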

Translation lookaside buffers (TLBs) are commonly implemented in PowerPC processors 
to keep recently-used page table entries on-chip. Although their exact characteristics are 
not specified by the architecture, the general concepts that are pertinent to the system 
software are described. 


The block address translation (BAT) mechanism is a software-controlled array that stores 
the available block address translations on-chip. BAT array entries are implemented as pairs 
of BAT registers that are accessible as supervisor special-purpose registers (SPRs); refer to 
Chapter 7, “Memory Management,” for more information. 

1.3 Changes to this Document 

The document from which this book was developed reflects changes made to the PowerPC 
architecture after the publication of Rev. 0 of “PowerPC Microprocessor Family: The 
Programming Environments Manual” and before Dec. 13, 1994 (Rev. 0.1). In addition, it 
reflects changes made to the architecture after the publication of Rev. 0.1 of The 
Programming Environments Manual and before Aug. 6, 1996 (Rev. 1). Although there are 
many changes in this revision of The Programming Environments Manual, the following 
sections summarize only the most significant changes and clarifications to the architecture 
specification. 

1.3.1 The Phasing Out of the Direct-store Function 

This function defined segments that were used to generate direct-store interface accesses 
on the external bus to communicate with specialized I/O devices; it was not optimized for 
performance in the PowerPC architecture and was present for compatibility with older 
devices only. As of this revision of the architecture (Rev. 1), direct-store segments are an 
optional processor feature. However, they are not likely to be supported in future 
implementations and new software should not use them. 

1.3.2 General Additions to and Refinements of the Architecture 

General additions to and refinements of the architecture specification are summarized in 
Table 1-1 and Table 1-2. These tables list changes made to the UISA that are reflected in 
this book and identify the chapters affected by those changes. 


NOTE: Many of the changes made in the UISA are reflected in both the VEA and OEA 
portions of the architecture as well. 

Table 1-1. UISA Changes — Rev. 0 to Rev. 0.1 


Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
The rules for handling of reserved bits in registers are clarified.     2

Clarified that isync does not wait for memory accesses to be performed. 4, 8

CR0[0-2] are undefined for some instructions in 64-bit mode.            4, 8

Clarified intermediate result with respect to floating-point operations 3
(the intermediate result has infinite precision and unbounded exponent
range).

Clarified the definition of rounding such that rounding always occurs   3
(specifically, FR and FI flags are always affected) for arithmetic,
rounding, and conversion instructions.

Clarified the definition of the term ‘tiny’ (detected before rounding). 3

In D.3.2, “Conversion from Floating-Point Number to Unsigned            D
Fixed-Point Integer Word,” changed value in FPR 3 from 2^32 to
2^32 - 1.

Noted additional POWER incompatibility for Store Floating-Point Single  B
(stfs) instruction.

Table 1-2. UISA Changes — Rev. 0.1 to Rev. 1.0 

Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
Although the stfiwx instruction is an optional instruction, it will     4, 8, A
likely be required for future processors.

Added the new Data Cache Block Allocate (dcba) instruction.             4, 5, 8, A

Deleted some warnings about generating misaligned little-endian access. 3


Table 1-3 and Table 1-4 list changes made to the VEA that are reflected in this book and the 
chapters that are affected by those changes. 

NOTE: Some changes to the UISA are reflected in the VEA and in turn, some changes 
to the VEA affect the OEA as well. 

Table 1-3. VEA Changes — Rev. 0 to Rev. 0.1 


Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
Clarified conditions under which a cache block is considered modified.  5

WIMG bits have meaning only when the effective address is translated.   2, 5, 7

Clarified that isync does not wait for memory accesses to be performed. 4, 5, 7, 8

Clarified paging implications of eciwx and ecowx.                       4, 5, 7, 8


Table 1-4. VEA Changes — Rev. 0.1 to Rev. 1.0 


Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
Added the requirement that caching-inhibited guarded store operations   5
are ordered.

Clarified use of the debt instruction in keeping instruction cache      5
coherency in the case of a combined instruction/data cache in a
multiprocessor system.


Table 1-5 and Table 1-6 list changes made to the OEA that are reflected in this book and the 
chapters that are affected by those changes. 

NOTE: Some changes to the UISA and VEA are reflected in the OEA as well. 


Table 1-5. OEA Changes — Rev. 0 to Rev. 0.1 


Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
Restricted several aspects of out-of-order operations.                  2, 4, 5, 6, 7

Clarified instruction fetching and instruction cache paradoxes.

Specified that IBATs contain W and G bits and that software must not    2, 7
write 1s to them.

Corrected the description of coherence when the W bit differs among     5
processors.

Clarified that referenced and changed bits are set for virtual pages.   7

Revised the description of changed bit setting to avoid depending on    7
the TLB.

Tightened the rules for setting the changed bit out of order.           5, 7

Specified which multiple DSISR bits may be set due to simultaneous DSI  6
exceptions.

Removed software synchronization requirements for reading the TB and    2
DEC.

More flexible DAR setting for a DABR exception.                         6

Table 1-6. OEA Changes — Rev. 0.1 to Rev. 1.0 

Change                                                                  Chapter(s) Affected
----------------------------------------------------------------------  -------------------
Changed definition of direct-store segments to an optional processor    2, 6, 7
feature that is not likely to be supported in future implementations;
new software should not use it.

Changed the ranges of bits saved from MSR to SRR1 (and restored from    2, 6
SRR1 to MSR on rfi) on an exception.

Clarified the definition of execution synchronization. Also clarified   2, 4, 8
that the mtmsr instructions are not execution synchronizing.

Clarified the use of memory allocated for predefined uses (including    6, 7
the exception vectors).

Revised the page table update synchronization requirements and
recommended code sequences.


Chapter 2. PowerPC Register Set 


This chapter describes the register organization defined by the three levels of the PowerPC 
architecture: 

• User instruction set architecture (UISA) 

• Virtual environment architecture (VEA), and 

• Operating environment architecture (OEA). 

The PowerPC architecture defines register-to-register operations for all computational 
instructions. Source data for these instructions are accessed from the on-chip registers or 
are provided as immediate values embedded in the opcode. The three-register instruction 
format allows specification of a target register distinct from the two source registers, thus 
preserving the original data for use by other instructions and reducing the number of 
instructions required for certain operations. Data is transferred between memory and 
registers with explicit load and store instructions only. 



NOTE: The handling of reserved bits in any register is implementation-dependent. 
Software is permitted to write any value to a reserved bit in a register. However, 
a subsequent reading of the reserved bit returns 0 if the value last written to the 
bit was 0 and returns an undefined value (may be 0 or 1) otherwise. This means 
that even if the last value written to a reserved bit was 1, reading that bit may 
return 0. 


2.1 PowerPC UISA Register Set 

The PowerPC UISA registers, shown in Figure 2-1, can be accessed by either user- or 
supervisor-level instructions (the architecture specification refers to user-level and 
supervisor-level as problem state and privileged state, respectively). The general-purpose 
registers (GPRs) and floating-point registers (FPRs) are accessed as instruction operands. 
Access to registers can be explicit (that is, through the use of specific instructions for that 
purpose such as Move to Special-Purpose Register (mtspr) and Move from Special- 
Purpose Register (mfspr) instructions) or implicit as part of the execution of an instruction. 
Some registers are accessed both explicitly and implicitly. 

The number to the right of the register names indicates the number that is used in the syntax 
of the instruction operands to access the register (for example, the number used to access 
theXERis SPR 1). 

NOTE: All registers are 32 bits wide except the Floating-Point Registers. 
USER MODEL 
UISA 

Register                                              Bits   Number 
General-purpose registers (GPR0-GPR31)                32 
Floating-point registers (FPR0-FPR31)                 64 
Condition register (CR)                               32 
Floating-point status and control register (FPSCR)    32 
XER register (XER)                                    32     SPR 1 
Link register (LR)                                    32     SPR 8 
Count register (CTR)                                  32     SPR 9 

Figure 2-1. UISA Programming Model — User-Level Registers 
The user-level registers can be accessed by all software with either user or supervisor 
privileges. The user-level registers are: 

• General-purpose registers (GPRs). The general-purpose register file consists of 32 
GPRs designated as GPR0-GPR31. The GPRs serve as data source or destination 
registers for all integer instructions and provide data for generating addresses. See 
Section 2.1.1, “General-Purpose Registers (GPRs),” for more information. 

• Floating-point registers (FPRs). The floating-point register file consists of 32 FPRs 
designated as FPR0-FPR31; these registers serve as either the source or the 
destination for all floating-point instructions. While the floating-point model 
includes data objects of either single- or double-precision floating-point format, the 
FPRs only contain data in double-precision format. For more information, see 
Section 2.1.2, “Floating-Point Registers (FPRs).” 

• Condition register (CR). The CR is a 32-bit register that is divided into eight 4-bit 
fields, CR0-CR7. This register stores the results of certain arithmetic operations and 
provides a mechanism for testing and branching. For more information, see Section 
2.1.3, “Condition Register (CR).” 

• Floating-point status and control register (FPSCR). The FPSCR contains all 
floating-point exception signal bits, exception summary bits, exception enable bits, and 
rounding control bits needed for compliance with the IEEE 754 standard. For more 
information, see Section 2.1.4, “Floating-Point Status and Control Register 
(FPSCR).” 

NOTE: The architecture specification refers to exceptions as interrupts. 

• XER register (XER). The XER indicates overflow and carry conditions for integer 
operations and the number of bytes to be transferred by the load/store string indexed 
instructions. For more information, see Section 2.1.5, “XER Register (XER).” 

• Link register (LR). The LR provides the branch target address for the Branch 
Conditional to Link Register (bclrx) instructions and can optionally hold the 
effective address of the instruction that follows a branch with link update 
instruction in the instruction stream, typically the return address for a 
subroutine. For more information, see Section 2.1.6, “Link Register (LR).” 

• Count register (CTR). The CTR holds a loop count that can be decremented during 
execution of appropriately coded branch instructions. The CTR can also provide the 
branch target address for the Branch Conditional to Count Register (bcctrx) 
instructions. For more information, see Section 2.1.7, “Count Register (CTR).” 

2.1.1 General-Purpose Registers (GPRs) 

Integer data is manipulated in the processor’s 32 GPRs shown in Figure 2-1. These registers 
are 32-bit registers. The GPRs are accessed as either source or destination registers in the 
instruction syntax. 
2.1.2 Floating-Point Registers (FPRs) 

The PowerPC architecture provides thirty-two 64-bit FPRs as shown in Figure 2-2. These 
registers are accessed as either source or destination registers for floating-point instructions. 
Each FPR supports the double-precision floating-point format. Every instruction that 
interprets the contents of an FPR as a floating-point value uses the double-precision 
floating-point format for this interpretation. 

Instructions for all floating-point arithmetic operations use the data located in the FPRs and, 
with the exception of compare instructions, place the result into an FPR. Information about 
the status of floating-point operations is placed into the FPSCR and, in some cases, into the 
CR after the completion of instruction execution. For information on how the CR is affected 
by floating-point operations, see Section 2.1.3, “Condition Register (CR).” 

Instructions to load and to store floating-point double precision values transfer 64 bits of 
data between memory and the FPRs with no conversion. 

Instructions to load floating-point single precision values are provided to read single- 
precision floating-point values from memory, convert them to double-precision floating- 
point format, and place them in the target floating-point register. 

Instructions to store single-precision values are provided to read double-precision floating- 
point values from a floating-point register, convert them to single-precision floating-point 
format, and place them in the target memory location. 

Instructions for single- and double-precision arithmetic operations accept values from the 
FPRs in double-precision format. For single-precision arithmetic and store 
instructions, all input values must be representable in single-precision format; otherwise, the 
results placed into the target FPR (or the memory location) and the setting of status bits in 
the FPSCR and in the condition register (if the instruction’s record bit, Rc, is set) are 
undefined. 

The floating-point arithmetic instructions produce intermediate results that may be 
regarded as infinitely precise and with unbounded exponent range. This intermediate result 
is normalized or denormalized if required, and then rounded to the destination format. The 
final result is then placed into the target FPR in the double-precision format or in fixed-point 
format, depending on the instruction. Refer to Section 3.3, “Floating-Point Execution 
Models — UISA,” for more information. 



[FPR0 through FPR31; bits of each register numbered 0 through 63] 


Figure 2-2. Floating-Point Registers (FPRs) 
2.1.3 Condition Register (CR) 

The condition register (CR) is a 32-bit register that reflects the result of certain operations 
and provides a mechanism for testing and branching. The bits in the CR are grouped into 
eight 4-bit fields, CR0-CR7, as shown below. 


Field   CR0   CR1   CR2    CR3     CR4     CR5     CR6     CR7 
Bits    0-3   4-7   8-11   12-15   16-19   20-23   24-27   28-31 


Figure 2-3. Condition Register (CR) 

The CR fields can be set in one of the following ways: 

• Specified fields of the CR can be set from a GPR by using the mtcrf instruction. 

• The contents of one CR field can be copied to another CR field by using the mcrf 
instruction. 

• The contents of XER[0-3] can be copied to a specified field of the CR by using the 
mcrxr instruction. 

• A specified field of the FPSCR can be copied to a specified field of the CR by using 
the mcrfs instruction. 

• Condition register logical instructions can be used to perform logical 
operations on specified bits in the condition register. 

• CR0 can be the implicit result of an integer instruction. 

• CR1 can be the implicit result of a floating-point instruction. 

• A specified CR field can indicate the result of either an integer or floating-point 
compare instruction. 

NOTE: Branch instructions are provided to test individual CR bits. 
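
The field layout in Figure 2-3 can be modeled in a few lines. The following is a minimal sketch in Python (used here only as an executable model, not as PowerPC code); the helper names cr_field and cr_bit are hypothetical and assume the manual's numbering, in which bit 0 is the most significant bit.

```python
def cr_field(cr: int, n: int) -> int:
    """Return the 4-bit field CRn of a 32-bit CR value.

    CR0 occupies bits 0-3, which are the four most significant bits.
    """
    if not 0 <= n <= 7:
        raise ValueError("CR field index must be 0-7")
    return (cr >> (28 - 4 * n)) & 0xF


def cr_bit(cr: int, bit: int) -> int:
    """Return CR bit `bit`, where bit 0 is the most significant bit."""
    if not 0 <= bit <= 31:
        raise ValueError("CR bit index must be 0-31")
    return (cr >> (31 - bit)) & 1
```

For example, cr_field(0x12345678, 2) selects bits 8-11 of the value, and cr_bit(cr, 2) selects the EQ bit of CR0.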
2.1.3.1 Condition Register CR0 Field Definition 

For all integer instructions, when the CR is set to reflect the result of the operation (that is, 
when Rc = 1), and for addic., andi., and andis., the first three bits of CR0 are set by an 
algebraic comparison of the result to zero; the fourth bit of CR0 is copied from XER[SO]. 
For integer instructions, CR bits 0-3 are set to reflect the result as a signed quantity. 

The CR bits are interpreted as shown in Table 2-1. If any portion of the result is undefined, 
the value placed into the first three bits of CR0 is undefined. 


Table 2-1. Bit Settings for CR0 Field of CR 

CR0 Bit   Description 
0         Negative (LT). This bit is set when the result is negative. 
1         Positive (GT). This bit is set when the result is positive (and not zero). 
2         Zero (EQ). This bit is set when the result is zero. 
3         Summary overflow (SO). This is a copy of the final state of XER[SO] 
          at the completion of the instruction. 


NOTE: If overflow occurs, CR0 may not reflect the true (that is, infinitely precise) 
results. 
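
The CR0 rule in Table 2-1 can be sketched as follows. This is an illustrative Python model under stated assumptions, not architecture-defined code: cr0_from_result is a hypothetical name, the result is treated as a 32-bit two's-complement value, and the caller supplies the final XER[SO] value.

```python
def cr0_from_result(result: int, so: int) -> int:
    """Return the 4-bit CR0 value (LT, GT, EQ, SO) for a 32-bit result."""
    result &= 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed quantity.
    signed = result - (1 << 32) if result & 0x80000000 else result
    lt = 1 if signed < 0 else 0
    gt = 1 if signed > 0 else 0
    eq = 1 if signed == 0 else 0
    # CR0 bit 0 (LT) is the most significant of the four bits.
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)
```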

2.1.3.2 Condition Register CR1 Field Definition 

In all floating-point instructions, when the CR is set to reflect the result of the operation (that 
is, when the instruction’s record bit, Rc, is set), CR1 (bits 4-7 of the CR) is copied from 
bits 0-3 of the FPSCR and indicates the floating-point exception status. For more 
information about the FPSCR, see Section 2.1.4, “Floating-Point Status and Control 
Register (FPSCR).” The bit settings for the CR1 field are shown in Table 2-2. 


Table 2-2. Bit Settings for CR1 Field of CR 

CR1 Bit   Description 
4         Floating-point exception (FX). This is a copy of the final state of 
          FPSCR[FX] at the completion of the instruction. 
5         Floating-point enabled exception (FEX). This is a copy of the final 
          state of FPSCR[FEX] at the completion of the instruction. 
6         Floating-point invalid exception (VX). This is a copy of the final state 
          of FPSCR[VX] at the completion of the instruction. 
7         Floating-point overflow exception (OX). This is a copy of the final 
          state of FPSCR[OX] at the completion of the instruction. 
2.1.3.3 Condition Register CRn Field — Compare Instruction 

For a compare instruction, when a specified CR field is set to reflect the result of the 
comparison, the bits of the specified field are interpreted as shown in Table 2-3. 


Table 2-3. CRn Field Bit Settings for Compare Instructions 

CRn Bit (1)   Description (2) 
0             Less than or floating-point less than (LT, FL). 
              For integer compare instructions: rA < SIMM or rB (signed comparison) or 
              rA < UIMM or rB (unsigned comparison). 
              For floating-point compare instructions: frA < frB. 
1             Greater than or floating-point greater than (GT, FG). 
              For integer compare instructions: rA > SIMM or rB (signed comparison) or 
              rA > UIMM or rB (unsigned comparison). 
              For floating-point compare instructions: frA > frB. 
2             Equal or floating-point equal (EQ, FE). 
              For integer compare instructions: rA = SIMM, UIMM, or rB. 
              For floating-point compare instructions: frA = frB. 
3             Summary overflow or floating-point unordered (SO, FU). 
              For integer compare instructions: This is a copy of the final state of XER[SO] 
              at the completion of the instruction. 
              For floating-point compare instructions: One or both of frA and frB is a Not a 
              Number (NaN). 

Notes: (1) Here, the bit indicates the bit number in any one of the 4-bit subfields, CR0-CR7. 
       (2) For a complete description of instruction syntax conventions, refer to Table 8-2 on 
           page 8-2. 
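
The integer rows of Table 2-3 can be illustrated with a short Python model. It is a sketch only: compare_field is a hypothetical helper, the operands are assumed to be 32-bit register values, and the signed flag stands in for the choice between a signed and an unsigned compare instruction.

```python
def compare_field(a: int, b: int, so: int, signed: bool = True) -> int:
    """Return the 4-bit CRn value (LT, GT, EQ, SO) for an integer compare."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    if signed:
        # Reinterpret both 32-bit patterns as two's-complement values.
        a = a - (1 << 32) if a & 0x80000000 else a
        b = b - (1 << 32) if b & 0x80000000 else b
    lt = 1 if a < b else 0
    gt = 1 if a > b else 0
    eq = 1 if a == b else 0
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)
```

The same bit pattern can compare differently in the two modes: 0xFFFF_FFFF is less than 1 as a signed value and greater than 1 as an unsigned value.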


2.1.4 Floating-Point Status and Control Register (FPSCR) 

The Floating-Point Status and Control Register (FPSCR), shown in Figure 2-4, is used for: 

• Recording exceptions generated by floating-point operations 

• Recording the type of the result produced by a floating-point operation 

• Controlling the rounding mode used by floating-point operations 

• Enabling or disabling the reporting of exceptions (that is, invoking the exception 
handler) 

Bits 0-23 are status bits. Bits 24-31 are control bits. Status bits in the FPSCR are updated 
at the completion of the instruction execution. 

Except for the floating-point enabled exception summary (FEX) and floating-point invalid 
operation exception summary (VX), the exception condition bits in the FPSCR (bits 0-12 
and 21-23) are sticky. Once set, sticky bits remain set until they are cleared by the relevant 
mcrfs, mtfsfi, mtfsf, or mtfsb0 instruction. 

FEX and VX are the logical ORs of other FPSCR bits. Therefore, these two bits are not 
listed among the FPSCR bits directly affected by the various instructions. 
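
Because FEX and VX are derived bits, they can be recomputed from the rest of the register. The sketch below models that derivation in Python using the bit numbers from Table 2-4; the names mask, bit, and summary_bits are hypothetical helpers introduced only for this illustration, and bit 0 is taken to be the most significant bit.

```python
def mask(n: int) -> int:
    """Mask for FPSCR bit n, where bit 0 is the most significant of 32 bits."""
    return 1 << (31 - n)


def bit(fpscr: int, n: int) -> int:
    """Value of FPSCR bit n."""
    return (fpscr >> (31 - n)) & 1


# Bit numbers from Table 2-4.
VX_SOURCES = (7, 8, 9, 10, 11, 12, 21, 22, 23)  # VXSNAN through VXCVI
VE = 24
# (status bit, enable bit): (OX, OE), (UX, UE), (ZX, ZE), (XX, XE)
STATUS_ENABLE_PAIRS = ((3, 25), (4, 26), (5, 27), (6, 28))


def summary_bits(fpscr: int):
    """Return (VX, FEX) as implied by the other FPSCR bits."""
    vx = int(any(bit(fpscr, n) for n in VX_SOURCES))
    fex = vx & bit(fpscr, VE)
    for status, enable in STATUS_ENABLE_PAIRS:
        fex |= bit(fpscr, status) & bit(fpscr, enable)
    return vx, fex
```

An exception bit contributes to FEX only when its enable bit is also set, which is why a set ZX with ZE clear leaves FEX at 0.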
[Bits 0-31, left to right: FX, FEX, VX, OX, UX, ZX, XX, VXSNAN, VXISI, VXIDI, VXZDZ, 
VXIMZ, VXVC, FR, FI, FPRF (15-19), Reserved (20), VXSOFT, VXSQRT, VXCVI, VE, OE, UE, 
ZE, XE, NI, RN (30-31)] 



Figure 2-4. Floating-Point Status and Control Register (FPSCR) 

A listing of FPSCR bit settings is shown in Table 2-4. 


Table 2-4. FPSCR Bit Settings 

Bit(s)  Name     Description 
0       FX       Floating-point exception summary. Every floating-point instruction, except 
                 mtfsfi and mtfsf, implicitly sets FPSCR[FX] if that instruction causes any of 
                 the floating-point exception bits in the FPSCR to transition from 0 to 1. The 
                 mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1 instructions can alter FPSCR[FX] 
                 explicitly. This is a sticky bit. 
1       FEX      Floating-point enabled exception summary. This bit signals the occurrence of 
                 any of the enabled exception conditions. It is the logical OR of all the 
                 floating-point exception bits masked by their respective enable bits 
                 (FEX = (VX & VE) | (OX & OE) | (UX & UE) | (ZX & ZE) | (XX & XE)). The mcrfs, 
                 mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot alter FPSCR[FEX] 
                 explicitly. This is not a sticky bit. 
2       VX       Floating-point invalid operation exception summary. This bit signals the 
                 occurrence of any invalid operation exception. It is the logical OR of all of 
                 the invalid operation exceptions. The mcrfs, mtfsf, mtfsfi, mtfsb0, and mtfsb1 
                 instructions cannot alter FPSCR[VX] explicitly. This is not a sticky bit. 
3       OX       Floating-point overflow exception. This is a sticky bit. See Section 3.3.6.2, 
                 “Overflow, Underflow, and Inexact Exception Conditions.” 
4       UX       Floating-point underflow exception. This is a sticky bit. See Section 3.3.6.2.2, 
                 “Underflow Exception Condition.” 
5       ZX       Floating-point zero divide exception. This is a sticky bit. See Section 3.3.6.1.2, 
                 “Zero Divide Exception Condition.” 
6       XX       Floating-point inexact exception. This is a sticky bit. See Section 3.3.6.2.3, 
                 “Inexact Exception Condition.” 
                 FPSCR[XX] is the sticky version of FPSCR[FI]. The following rules describe how 
                 FPSCR[XX] is set by a given instruction: 
                 • If the instruction affects FPSCR[FI], the new value of FPSCR[XX] is obtained 
                   by logically ORing the old value of FPSCR[XX] with the new value of FPSCR[FI]. 
                 • If the instruction does not affect FPSCR[FI], the value of FPSCR[XX] is 
                   unchanged. 
7       VXSNAN   Floating-point invalid operation exception for SNaN. This is a sticky bit. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
8       VXISI    Floating-point invalid operation exception for ∞ - ∞. This is a sticky bit. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
9       VXIDI    Floating-point invalid operation exception for ∞ ÷ ∞. This is a sticky bit. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
10      VXZDZ    Floating-point invalid operation exception for 0 ÷ 0. This is a sticky bit. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
11      VXIMZ    Floating-point invalid operation exception for ∞ × 0. This is a sticky bit. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
12      VXVC     Floating-point invalid operation exception for invalid compare. This is a 
                 sticky bit. See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
13      FR       Floating-point fraction rounded. The last arithmetic or rounding and conversion 
                 instruction that rounded the intermediate result incremented the fraction. 
                 This bit is NOT sticky. See Section 3.3.5, “Rounding.” 
14      FI       Floating-point fraction inexact. The last arithmetic or rounding and conversion 
                 instruction either rounded the intermediate result (producing an inexact 
                 fraction) or caused a disabled overflow exception. This bit is NOT sticky. 
                 See Section 3.3.5, “Rounding.” For more information regarding the relationship 
                 between FPSCR[FI] and FPSCR[XX], see the description of the FPSCR[XX] bit. 
15-19   FPRF     Floating-point result flags. For arithmetic, rounding, and conversion 
                 instructions, the field is based on the result placed into the target register, 
                 except that if any portion of the result is undefined, the value placed here is 
                 undefined. 
                 15     Floating-point result class descriptor (C). Arithmetic, rounding, and 
                        conversion instructions may set this bit with the FPCC bits to indicate 
                        the class of the result as shown in Table 2-5. 
                 16-19  Floating-point condition code (FPCC). Floating-point compare 
                        instructions always set one of the FPCC bits to one and the other three 
                        FPCC bits to zero. Arithmetic, rounding, and conversion instructions may 
                        set the FPCC bits with the C bit to indicate the class of the result. 
                        Note: In this case the high-order three bits of the FPCC retain their 
                        relational significance indicating that the value is less than, greater 
                        than, or equal to zero. 
                        16  Floating-point less than or negative (FL or <) 
                        17  Floating-point greater than or positive (FG or >) 
                        18  Floating-point equal or zero (FE or =) 
                        19  Floating-point unordered or NaN (FU or ?) 
                 Note: These are NOT sticky bits. 
20      -        Reserved 
21      VXSOFT   Floating-point invalid operation exception for software request. This is a 
                 sticky bit. This bit can be altered only by one of the following instructions: 
                 mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1. See Section 3.3.6.1.1, “Invalid 
                 Operation Exception Condition.” 
22      VXSQRT   Floating-point invalid operation exception for invalid square root. This is a 
                 sticky bit. See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
23      VXCVI    Floating-point invalid operation exception for invalid integer convert. This is 
                 a sticky bit. See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
24      VE       Floating-point invalid operation exception enable. 
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.” 
25      OE       IEEE floating-point overflow exception enable. 
                 See Section 3.3.6.2, “Overflow, Underflow, and Inexact Exception Conditions.” 
26      UE       IEEE floating-point underflow exception enable. 
                 See Section 3.3.6.2.2, “Underflow Exception Condition.” 
27      ZE       IEEE floating-point zero divide exception enable. 
                 See Section 3.3.6.1.2, “Zero Divide Exception Condition.” 
28      XE       Floating-point inexact exception enable. 
                 See Section 3.3.6.2.3, “Inexact Exception Condition.” 
29      NI       Floating-point non-IEEE mode. If this bit is set, results need not conform with 
                 IEEE standards and the other FPSCR bits may have meanings other than those 
                 described here. If the bit is set and if all implementation-specific 
                 requirements are met and if an IEEE-conforming result of a floating-point 
                 operation would be a denormalized number, the result produced is zero 
                 (retaining the sign of the denormalized number). Any other effects associated 
                 with setting this bit are described in the user’s manual for the implementation 
                 (the effects are implementation-dependent). 
30-31   RN       Floating-point rounding control. See Section 3.3.5, “Rounding.” 
                 00  Round to nearest 
                 01  Round toward zero 
                 10  Round toward +infinity 
                 11  Round toward -infinity 
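
Two of the rules in Table 2-4 are simple enough to model directly: the sticky update of FPSCR[XX] from FPSCR[FI], and the decoding of the RN field. The following Python sketch is illustrative only; update_xx and rounding_mode are hypothetical names, and passing None for new_fi is an assumed convention meaning that the instruction does not affect FPSCR[FI].

```python
ROUNDING_MODES = {
    0b00: "Round to nearest",
    0b01: "Round toward zero",
    0b10: "Round toward +infinity",
    0b11: "Round toward -infinity",
}


def update_xx(old_xx: int, new_fi):
    """Return the new FPSCR[XX] value.

    new_fi is 0 or 1 when the instruction affects FPSCR[FI],
    or None when it does not.
    """
    if new_fi is None:
        return old_xx           # XX is unchanged
    return old_xx | new_fi      # XX accumulates FI and is never cleared here


def rounding_mode(fpscr: int) -> str:
    """Decode RN, which occupies FPSCR bits 30-31 (the two low-order bits)."""
    return ROUNDING_MODES[fpscr & 0b11]
```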


Table 2-5 illustrates the floating-point result flags used by PowerPC processors. The result 
flags correspond to FPSCR bits 15-19. 


Table 2-5. Floating-Point Result Flags in FPSCR 


Result Flags (Bits 15-19)    Result Value Class 
C   <   >   =   ? 
1   0   0   0   1            Quiet NaN 
0   1   0   0   1            -Infinity 
0   1   0   0   0            -Normalized number 
1   1   0   0   0            -Denormalized number 
1   0   0   1   0            -Zero 
0   0   0   1   0            +Zero 
1   0   1   0   0            +Denormalized number 
0   0   1   0   0            +Normalized number 
0   0   1   0   1            +Infinity 
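
Table 2-5 amounts to a lookup on the five FPRF bits. A minimal Python sketch of that lookup follows; the names FPRF_CLASSES and classify_result are hypothetical, and the code assumes the manual's numbering, so that FPSCR bits 15-19 sit 12 bits above the low-order end of the 32-bit register.

```python
# Keys are the five result flags in the order C, <, >, =, ? (Table 2-5).
FPRF_CLASSES = {
    0b10001: "Quiet NaN",
    0b01001: "-Infinity",
    0b01000: "-Normalized number",
    0b11000: "-Denormalized number",
    0b10010: "-Zero",
    0b00010: "+Zero",
    0b10100: "+Denormalized number",
    0b00100: "+Normalized number",
    0b00101: "+Infinity",
}


def classify_result(fpscr: int):
    """Return the result class named by FPSCR[15-19], or None if unlisted."""
    fprf = (fpscr >> 12) & 0x1F
    return FPRF_CLASSES.get(fprf)
```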
2.1.5 XER Register (XER) 

The XER register (XER) is a 32-bit, user-level register shown in Figure 2-5. 


Bits     Field 
0        SO 
1        OV 
2        CA 
3-24     Reserved (0 0000 0000 0000 0000 0000 0) 
25-31    Byte count 


Figure 2-5. XER Register 

The bit definitions for XER, shown in Table 2-6, are based on the operation of an 
instruction considered as a whole, not on intermediate results. For example, the result of the 
Subtract from Carrying (subfcx) instruction is specified as the sum of three values. This 
instruction sets bits in the XER based on the entire operation, not on an intermediate sum. 


Table 2-6. XER Bit Definitions 


Bit(s)  Name  Description 
0       SO    Summary overflow. The summary overflow bit (SO) is set whenever an instruction 
              (except mtspr) sets the overflow bit (OV). Once set, the SO bit remains set until 
              it is cleared by an mtspr instruction (specifying the XER) or an mcrxr 
              instruction. It is not altered by compare instructions, nor by other instructions 
              (except mtspr to the XER, and mcrxr) that cannot overflow. Executing an mtspr 
              instruction to the XER, supplying the values zero for SO and one for OV, causes 
              SO to be cleared and OV to be set. 
1       OV    Overflow. The overflow bit (OV) is set to indicate that an overflow has occurred 
              during execution of an instruction. Add, subtract from, and negate instructions 
              having OE = 1 set the OV bit if the carry out of the msb is not equal to the 
              carry into the msb, and clear it otherwise. Multiply low and divide instructions 
              having OE = 1 set the OV bit if the result cannot be represented in 32 bits 
              (mullw, divw, divwu), and clear it otherwise. The OV bit is not altered by 
              compare instructions, nor by other instructions that cannot overflow (except 
              mtspr to the XER, and mcrxr). 
2       CA    Carry. The carry bit (CA) is set during execution of the following instructions: 
              • Add carrying, subtract from carrying, add extended, and subtract from extended 
                instructions set CA if there is a carry out of the msb, and clear it otherwise. 
              • Shift right algebraic instructions set CA if any 1 bits have been shifted out 
                of a negative operand, and clear it otherwise. 
              The CA bit is not altered by compare instructions, nor by other instructions that 
              do not set carry (except shift right algebraic, mtspr to the XER, and mcrxr). 
3-24    -     Reserved 
25-31         This field specifies the number of bytes to be transferred by a Load String Word 
              Indexed (lswx) or Store String Word Indexed (stswx) instruction. 
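
The OV and CA rules for addition in Table 2-6 can be checked with a small Python model. This is a sketch under stated assumptions: add_with_flags is a hypothetical helper, operands are 32-bit values, and the function reports what CA and OV would be for an add that is coded to set them.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a: int, b: int, carry_in: int = 0):
    """Return (result, CA, OV) for a 32-bit add with an optional carry in."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in
    result = total & MASK32
    # CA: carry out of the most significant bit (msb).
    carry_out = total >> 32
    # Carry into the msb comes from adding the low-order 31 bits.
    carry_into_msb = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    # OV: set when the carry out of the msb differs from the carry into it.
    overflow = carry_out ^ carry_into_msb
    return result, carry_out, overflow
```

Adding 1 to 0x7FFF_FFFF overflows without a carry out, while adding 1 to 0xFFFF_FFFF carries out without overflowing. Because SO is sticky, a model of XER would update it as SO | OV after each such instruction.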
2.1.6 Link Register (LR) 

The link register (LR) is a 32-bit register which supplies the branch target address for the 
Branch Conditional to Link Register (bclrx) instructions, and in the case of a branch with 
link update instruction, can be used to hold the logical address of the instruction that 
follows the branch with link update instruction (for returning from a subroutine). The 
format of LR is shown in Figure 2-6. 


[Bits 0-31: Branch Address] 


Figure 2-6. Link Register (LR) 

NOTE: Although the two least-significant bits can accept any values written to them, 
they are ignored when the LR is used as an address. Both conditional and 
unconditional branch instructions include the option of placing the logical 
address of the instruction following the branch instruction in the LR. 
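
The two LR behaviors described above can be sketched in Python as follows. The helper names are hypothetical, and the 4-byte instruction length is stated here as a property of PowerPC instruction encoding rather than taken from this section.

```python
MASK32 = 0xFFFFFFFF
INSTRUCTION_BYTES = 4  # every PowerPC instruction is 32 bits long


def link_address(branch_address: int) -> int:
    """Address placed in the LR by a branch with link update at branch_address."""
    return (branch_address + INSTRUCTION_BYTES) & MASK32


def branch_target_from_lr(lr: int) -> int:
    """Address used when the LR supplies a branch target.

    The two least-significant bits are ignored.
    """
    return lr & 0xFFFFFFFC
```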

The link register can also be accessed by the mtspr and mfspr instructions using SPR 8. 
Prefetching instructions along the target path (loaded by an mtspr instruction) is possible 
provided the link register is loaded sufficiently ahead of the branch instruction so that any 
branch prediction hardware can calculate the branch address. Additionally, PowerPC 
processors can prefetch along a target path loaded by a branch and link instruction. 

NOTE: Some PowerPC processors may keep a stack of the LR values most recently set 
by branch with link update instructions. To benefit from these enhancements, use 
of the link register should be restricted to the manner described in 
Section 4.2.4.2, “Conditional Branch Control.” 

2.1.7 Count Register (CTR) 

The count register (CTR) is a 32-bit register. The CTR can hold a loop count that can be 
decremented during execution of branch instructions that contain an appropriately coded 
BO field. If the value in CTR is 0 before being decremented, it is 0xFFFF_FFFF (2^32 - 1) 
afterwards. The CTR can also provide the branch target address for the Branch Conditional 
to Count Register (bcctrx) instruction. The CTR is shown in Figure 2-7. 


[Bits 0-31: CTR] 


Figure 2-7. Count Register (CTR) 

Prefetching instructions along the target path is also possible provided the count register is 
loaded sufficiently ahead of the branch instruction so that any branch prediction hardware 
can calculate the correct value of the loop count. 
The count register can also be accessed by the mtspr and mfspr instructions by specifying 
SPR 9. In branch conditional instructions, the BO field specifies the conditions under which 
the branch is taken. The first four bits of the BO field specify how the branch is affected by 
or affects the CR and the CTR. The encoding for the BO field is shown in Table 2-7. 


Table 2-7. BO Operand Encodings 


BO      Description 
0000y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE. 
0001y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE. 
001zy   Branch if the condition is FALSE. 
0100y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE. 
0101y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE. 
011zy   Branch if the condition is TRUE. 
1z00y   Decrement the CTR, then branch if the decremented CTR ≠ 0. 
1z01y   Decrement the CTR, then branch if the decremented CTR = 0. 
1z1zz   Branch always. 

Notes: The y bit provides a hint about whether a conditional branch is likely to be taken and 
is used by some PowerPC implementations to improve performance. Other implementations may 
ignore the y bit. 

The z indicates a bit that is ignored. The z bits should be cleared (zero), as they may be 
assigned a meaning in a future version of the PowerPC UISA. 
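
The BO encodings in Table 2-7 can be read as four independent controls plus the hint bit. The Python sketch below models that reading; branch_conditional is a hypothetical function, cr_bit stands for the CR bit selected by the instruction (1 meaning TRUE), and the y and z bits are ignored because they do not change whether the branch is taken.

```python
MASK32 = 0xFFFFFFFF


def branch_conditional(bo: int, ctr: int, cr_bit: int):
    """Return (taken, new_ctr) for a 5-bit BO value.

    Reading BO from its most significant bit:
      bit 0: 1 = do not test the condition
      bit 1: condition value (TRUE or FALSE) required for the branch
      bit 2: 1 = do not decrement the CTR
      bit 3: 1 = branch if decremented CTR = 0, 0 = branch if it is not 0
      bit 4: y hint (ignored by this model)
    """
    bo0 = (bo >> 4) & 1
    bo1 = (bo >> 3) & 1
    bo2 = (bo >> 2) & 1
    bo3 = (bo >> 1) & 1

    if not bo2:
        # A CTR of 0 wraps to 0xFFFF_FFFF when decremented.
        ctr = (ctr - 1) & MASK32

    ctr_ok = bool(bo2) or ((ctr != 0) != bool(bo3))
    cond_ok = bool(bo0) or (cr_bit == bo1)
    return (ctr_ok and cond_ok), ctr
```

With BO = 1z00y, a CTR of 1 decrements to 0 and the branch falls through, whereas a CTR of 0 wraps to 0xFFFF_FFFF and the branch is taken.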


2.2 PowerPC VEA Register Set — Time Base 


The PowerPC virtual environment architecture (VEA) defines registers in addition to those 
defined by the UISA. The PowerPC VEA register set can be accessed by all software with 
either user- or supervisor-level privileges. Figure 2-8 illustrates the 
PowerPC VEA register set. (Figure 2-8 is similar to Figure 2-1, with the 
additional PowerPC VEA registers.) 



The PowerPC VEA introduces the time base facility (TB), a 64-bit structure that consists 
of two 32-bit registers — time base upper (TBU) and time base lower (TBL). 


NOTE: The time base registers can be accessed by both user- and supervisor-level 
instructions. In the context of the VEA, user-level applications are permitted 
read-only access to the TB. The OEA defines supervisor-level access to the TB 
for writing values to the TB. See Section 2.3.12, “Time Base Facility 
(TB) — OEA,” for more information. 

The general-purpose registers (GPRs), link register (LR), and count register (CTR) 
are 32 bits. These registers are described fully in Section 2.1, “PowerPC UISA 
Register Set.” 

In Figure 2-8, the number to the right of each register name indicates the number that is used 
in the syntax of the instruction operands to access the register (for example, the number 
used to access the XER is SPR 1). 
USER MODEL 
UISA 

Register                                              Bits   Number 
General-purpose registers (GPR0-GPR31)                32 
Floating-point registers (FPR0-FPR31)                 64 
Condition register (CR)                               32 
Floating-point status and control register (FPSCR)    32 
XER register (XER)                                    32     SPR 1 
Link register (LR)                                    32     SPR 8 
Count register (CTR)                                  32     SPR 9 

USER MODEL 
VEA 

Time Base Facility (For Reading) 
Time base lower (TBL)                                 32     TBR 268 
Time base upper (TBU)                                 32     TBR 269 


Figure 2-8. VEA Programming Model — User-Level Registers Plus Time Base 
The time base (TB), shown in Figure 2-9, is a 64-bit structure that contains a 64-bit 
unsigned integer that is incremented periodically. Each increment adds 1 to the low-order 
bit (bit 31 of TBL). The frequency at which the counter is incremented is implementation- 
dependent. 


TBU (bits 0-31): Upper 32 bits of time base 
TBL (bits 0-31): Lower 32 bits of time base 


Figure 2-9. Time Base (TB) 

The TB increments until its value becomes 0xFFFF_FFFF_FFFF_FFFF (2^64 - 1). At the 
next increment its value becomes 0x0000_0000_0000_0000. 


NOTE: There is no explicit indication that this has occurred (that is, no exception is 
generated). 
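
The relationship between TBU, TBL, and the 64-bit count can be sketched in Python as follows. The function names are hypothetical and the code is only a model of the arithmetic, including the silent wrap from 2^64 - 1 to 0.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def tb_value(tbu: int, tbl: int) -> int:
    """Combine the upper and lower halves into the 64-bit time base value."""
    return ((tbu & MASK32) << 32) | (tbl & MASK32)


def tb_increment(tb: int) -> int:
    """Add 1 to the low-order bit; wraps to 0 with no other indication."""
    return (tb + 1) & MASK64
```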

The period of the time base depends on the driving frequency. The TB is implemented such 
that the following requirements are satisfied: 

1. Loading a GPR from the time base has no effect on the accuracy of the time base. 

2. Storing a GPR to the time base replaces the value in the time base with the value in 
the GPR. 


The PowerPC VEA does not specify a relationship between the frequency at which the time 
base is updated and other frequencies, such as the processor clock. The TB update 
frequency is not required to be constant; however, for the system software to maintain time 
of day and operate interval timers, one of two things is required: 

• The system provides an implementation-dependent exception to software whenever 
the update frequency of the time base changes and a means to determine the current 
update frequency; or 

• The system software controls the update frequency of the time base. 

NOTE: If the operating system initializes the TB to some reasonable value and the 
update frequency of the TB is constant, the TB can be used as a source of values 
that increase at a constant rate, such as for time stamps in trace entries. 

Even if the update frequency is not constant, values read from the TB are monotonically 
increasing (except when the TB wraps from 2^64 - 1 to 0). If a trace entry is recorded each 
time the update frequency changes, the sequence of TB values can be postprocessed to 
become actual time values. 


However, successive readings of the time base may return identical values due to 
implementation-dependent factors such as a low update frequency or initialization. 









2.2.1 Reading the Time Base 

The mftb instruction is used to read the time base. The following sections discuss reading 
the time base. For specific details on using the mftb instruction, see Chapter 8, “Instruction 
Set.” For information on writing the time base, see Section 2.3.12.1, “Writing to the Time 
Base.” 

It is not possible to read the entire 64-bit time base in a single instruction. The mftb 
simplified mnemonic moves from the lower half of the time base register (TBL) to a GPR, 
and the mftbu simplified mnemonic moves from the upper half of the time base (TBU) to 
a GPR. 

Because of the possibility of a carry from TBL to TBU occurring between reads of the TBL 
and TBU, a sequence such as the following example is necessary to read the time base: 

loop:
    mftbu  rx        #load from TBU
    mftb   ry        #load from TBL
    mftbu  rz        #load from TBU
    cmpw   rz,rx     #see if 'old' = 'new'
    bne    loop      #loop if carry occurred


The comparison and loop are necessary to ensure that a consistent pair of values has been 
obtained. 

2.2.2 Computing Time of Day from the Time Base 

Since the update frequency of the time base is system-dependent, the algorithm for 
converting the current value in the time base to time-of-day is also system-dependent. 

In a system in which the update frequency of the time base may change over time, it is not 
possible to convert an isolated time base value into time of day. Instead, a time base value 
has meaning only with respect to the current update frequency and the time of day that the 
update frequency was last changed. Each time the update frequency changes, either the 
system software is notified of the change via an exception, or else the change was instigated 
by the system software itself. At each such change, the system software must compute the 
current time of day using the old update frequency, compute a new value of ticks-per- 
second for the new frequency, and save the time of day, time base value, and tick rate. 
Subsequent calls to compute time of day use the current time base value and the saved data. 

A generalized service to compute time of day could take the following as input: 

• Time of day at beginning of current epoch 

• Time base value at beginning of current epoch 

• Time base update frequency 

• Time base value for which time of day is desired 

For a PowerPC system in which the time base update frequency does not vary, the first three 
inputs would be constant. 
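
A minimal sketch of such a service in C follows. The tb_epoch structure and both function names are assumptions made for illustration; time of day is kept in whole seconds, so any fraction of a second is dropped each time a new epoch is recorded.

```c
#include <assert.h>
#include <stdint.h>

/* Saved data for the current epoch (hypothetical layout). */
typedef struct {
    uint64_t epoch_tod_sec;  /* time of day at beginning of current epoch */
    uint64_t epoch_tb;       /* time base value at beginning of current epoch */
    uint64_t ticks_per_sec;  /* time base update frequency */
} tb_epoch;

/* Convert a time base value to time of day, in whole seconds. */
uint64_t tb_to_seconds(const tb_epoch *e, uint64_t tb)
{
    assert(e->ticks_per_sec != 0);
    return e->epoch_tod_sec + (tb - e->epoch_tb) / e->ticks_per_sec;
}

/* Called when the update frequency changes: compute the current time of
   day using the old frequency, then save the new epoch data. */
void tb_change_frequency(tb_epoch *e, uint64_t tb_now, uint64_t new_ticks_per_sec)
{
    e->epoch_tod_sec = tb_to_seconds(e, tb_now);
    e->epoch_tb = tb_now;
    e->ticks_per_sec = new_ticks_per_sec;
}
```

On a system whose update frequency never varies, tb_change_frequency is never called and the three saved fields stay constant, as noted above.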

2.3 PowerPC OEA Register Set

The PowerPC operating environment architecture (OEA) completes the discussion of
PowerPC registers. Figure 2-10 shows the entire PowerPC register set — UISA, VEA,
and OEA. In Figure 2-10, the number to the right of each register name is the number
used in the syntax of the instruction operands to access the register (for example,
the number used to access the XER is SPR 1).

All of the SPRs in the OEA can be accessed only by supervisor-level instructions; any 
attempt to access these SPRs with user-level instructions results in a supervisor-level 
exception. Some SPRs are implementation-specific. In some cases, not all of a register’s 
bits are implemented in hardware. 

If a PowerPC processor executes an mtspr/mfspr instruction with an undefined SPR 
encoding, it takes (depending on the implementation) an illegal instruction program 
exception, a privileged instruction program exception, or the results are boundedly 
undefined. See Section 6.4.7, “Program Exception (0x00700),” for more information. 

NOTE: The GPRs, LR, CTR, TBL, MSR, DAR, SDR1, SRR0, SRR1, and 
SPRG0-SPRG3 are 32 bits wide. 

USER MODEL (UISA)

  General-Purpose Registers            GPR0 (32) through GPR31 (32)
  Floating-Point Registers             FPR0 (64) through FPR31 (64)
  Condition Register                   CR (32)
  Floating-Point Status and
    Control Register                   FPSCR (32)
  XER Register                         XER (32)       SPR 1
  Link Register                        LR (32)
  Count Register                       CTR (32)

USER MODEL (VEA)

  Time Base Facility (For Reading)     TBL (32)       TBR 268
                                       TBU (32)       TBR 269

SUPERVISOR MODEL (OEA)

  Configuration Registers
    Machine State Register             MSR (32)
    Processor Version Register         PVR (32)       SPR 287

  Memory Management Registers
    Instruction BAT Registers          IBAT0U (32)    SPR 528
                                       IBAT0L (32)    SPR 529
                                       IBAT1U (32)    SPR 530
                                       IBAT1L (32)    SPR 531
                                       IBAT2U (32)    SPR 532
                                       IBAT2L (32)    SPR 533
                                       IBAT3U (32)    SPR 534
                                       IBAT3L (32)    SPR 535
    Data BAT Registers                 DBAT0U (32)    SPR 536
                                       DBAT0L (32)    SPR 537
                                       DBAT1U (32)    SPR 538
                                       DBAT1L (32)    SPR 539
                                       DBAT2U (32)    SPR 540
                                       DBAT2L (32)    SPR 541
                                       DBAT3U (32)    SPR 542
                                       DBAT3L (32)    SPR 543
    SDR1                               SDR1 (32)      SPR 25
    Segment Registers                  SR0 (32) through SR15 (32)

  Exception Handling Registers
    Data Address Register              DAR (32)       SPR 19
    DSISR                              DSISR (32)     SPR 18
    SPRGs                              SPRG0 (32)     SPR 272
                                       SPRG1 (32)     SPR 273
                                       SPRG2 (32)     SPR 274
                                       SPRG3 (32)     SPR 275
    Save and Restore Registers         SRR0 (32)      SPR 26
                                       SRR1 (32)      SPR 27
    Floating-Point Exception
      Cause Register (Optional)        FPECR          SPR 1022

  Miscellaneous Registers
    Time Base Facility (For Writing)   TBL (32)       SPR 284
                                       TBU (32)       SPR 285
    Decrementer                        DEC (32)       SPR 22
    Processor Identification
      Register (Optional)              PIR            SPR 1023
    Data Address Breakpoint
      Register (Optional)              DABR (32)      SPR 1013
    External Access Register
      (Optional)                       EAR (32)       SPR 282

Figure 2-10. OEA Programming Model — All Registers 
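
The SPR numbers in Figure 2-10 can be collected into constants for use by system software. The sketch below is one possible arrangement in C; the enumerator and function names are assumptions, and the helper functions rely only on the numbering pattern visible in the figure (each BAT pair occupies two consecutive SPR numbers, upper first).

```c
#include <assert.h>

/* SPR numbers taken from Figure 2-10 (names are illustrative). */
enum spr_number {
    SPR_XER       = 1,
    SPR_DSISR     = 18,
    SPR_DAR       = 19,
    SPR_DEC       = 22,
    SPR_SDR1      = 25,
    SPR_SRR0      = 26,
    SPR_SRR1      = 27,
    SPR_SPRG0     = 272,
    SPR_EAR       = 282,
    SPR_TBL_WRITE = 284,
    SPR_TBU_WRITE = 285,
    SPR_PVR       = 287,
    SPR_IBAT0U    = 528,
    SPR_DBAT0U    = 536,
    SPR_DABR      = 1013,
    SPR_FPECR     = 1022,
    SPR_PIR       = 1023
};

/* SPR number of IBAT<pair>U (lower == 0) or IBAT<pair>L (lower != 0). */
int spr_ibat(int pair, int lower)
{
    assert(pair >= 0 && pair <= 3);
    return SPR_IBAT0U + 2 * pair + (lower ? 1 : 0);
}

/* SPR number of DBAT<pair>U (lower == 0) or DBAT<pair>L (lower != 0). */
int spr_dbat(int pair, int lower)
{
    assert(pair >= 0 && pair <= 3);
    return SPR_DBAT0U + 2 * pair + (lower ? 1 : 0);
}

/* SPR number of SPRG<n>. */
int spr_sprg(int n)
{
    assert(n >= 0 && n <= 3);
    return SPR_SPRG0 + n;
}
```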

The PowerPC OEA supervisor-level registers are:

• Configuration registers include: 

— A machine state register (MSR) which defines the state of the processor. The 
MSR can be modified by the Move to Machine State Register (mtmsr), System 
Call (sc), and Return from Interrupt (rfi) instructions. It can be read by the Move 
from Machine State Register (mfmsr) instruction. For more information, see 
Section 2.3.1, “Machine State Register (MSR).” 

— A processor version register (PVR) which is a read-only register that identifies 
the version (model) and revision level of the PowerPC processor. For more 
information, see Section 2.3.2, “Processor Version Register (PVR).” 

• Memory management registers include: 

— Block-address translation (BAT) registers. The PowerPC OEA includes eight
pairs of block-address translation registers (BATs), consisting of four pairs of
instruction BATs (IBAT0U-IBAT3U and IBAT0L-IBAT3L) and four pairs of data BATs
(DBAT0U-DBAT3U and DBAT0L-DBAT3L). See Figure 2-10 for a list of the
SPR numbers for the BAT registers. Refer to Section 2.3.3, “BAT Registers,” for
more information.

— An SDR1 register which specifies the page table base address used in virtual-to- 
physical address translation. For more information, see Section 2.3.4, “SDR1.” 

NOTE: The physical address is referred to as the real address in the architecture 
specification. 

— Segment registers (SR). The PowerPC OEA defines sixteen 32-bit segment 
registers (SR0-SR15). The fields in the segment register are interpreted 
differently depending on the value of bit 0. For more information, see 
Section 2.3.5, “Segment Registers.” 

• Exception handling registers include: 

— A data address register (DAR) which is set to the effective address generated by
the memory access that caused a DSI or an alignment exception. For more
information, see Section 2.3.6, “Data Address Register (DAR).”

— The SPRG0-SPRG3 registers which are provided for operating system use. For 
more information, see Section 2.3.7, “SPRG0-SPRG3.” 

— A DSISR which defines the cause of DSI and alignment exceptions. For more 
information, refer to Section 2.3.8, “DSISR.” 

— A machine status save/restore register 0 (SRR0). The SRR0 register is used to
save the program effective address on exceptions and to return to the interrupted
program when an rfi instruction is executed. For more information, see Section
2.3.9, “Machine Status Save/Restore Register 0 (SRR0).”

— A machine status save/restore register 1 (SRR1). The SRR1 register is used to
save the MSR and machine exception status bits and to restore the MSR
when an rfi instruction is executed. For more information, see Section 2.3.10,
“Machine Status Save/Restore Register 1 (SRR1).”

— A floating-point exception cause register (FPECR) to identify the cause of a
floating-point exception. (This is an optional register.) 

• Miscellaneous registers include: 

— Time base (TB). The TB is a 64-bit structure that maintains the time of day and 
operates interval timers. The TB consists of two 32-bit registers — time base 
upper (TBU) and time base lower (TBL). 

The time base registers can be accessed by both user- and supervisor-level 
instructions. For more information, see Section 2.3.12, “Time Base Facility 
(TB) — OEA” and Section 2.2, “PowerPC VEA Register Set — Time Base.” 

— Decrementer register (DEC). This register is a 32-bit decrementing counter that 
provides a mechanism for causing a decrementer exception after a 
programmable delay; the frequency is a subdivision of the processor clock. For 
more information, see Section 2.3.13, “Decrementer Register (DEC).” 

— External access register (EAR). This optional register is used in conjunction with 
the eciwx and ecowx instructions. 

The EAR register and the eciwx and ecowx instructions are optional in the 
PowerPC architecture and may not be supported in all PowerPC processors that 
implement the OEA. For more information about the external control facility, see 
Section 4.3.4, “External Control Instructions.” 

— Data address breakpoint register (DABR). This optional register is used to 
control the data address breakpoint facility. 

The DABR is optional in the PowerPC architecture and may not be supported in 
all PowerPC processors that implement the OEA. For more information about 
the data address breakpoint facility, see Section 6.4.3, “DSI Exception 
(0x00300).” 

— Processor identification register (PIR). This optional register is used to hold a 
value that distinguishes an individual processor in a multiprocessor environment. 

2.3.1 Machine State Register (MSR) 

The machine state register (MSR) is a 32-bit register (see Figure 2-11). The MSR defines
the state of the processor. When an exception occurs, the contents of the MSR are
saved in SRR1, and a new set of bits is loaded into the MSR as determined by the exception.
The MSR can also be modified by the mtmsr, sc, and rfi instructions. It can be read by
the mfmsr instruction.

Figure 2-11. Machine State Register (MSR)

Table 2-8 shows the bit definitions for the MSR.


Table 2-8. MSR Bit Settings

Bit(s)  Name  Description

0-12    —     Reserved

13      POW   Power management enable
              0  Power management disabled (normal operation mode)
              1  Power management enabled (reduced power mode)
              Note: Power management functions are implementation-dependent. If the
              function is not implemented, this bit is treated as reserved.

14      —     Reserved

15      ILE   Exception little-endian mode. When an exception occurs, this bit is copied
              into MSR[LE] to select the endian mode for the context established by the
              exception.

16      EE    External interrupt enable
              0  While the bit is cleared, the processor delays recognition of external
                 interrupts and decrementer exception conditions.
              1  The processor is enabled to take an external interrupt or the
                 decrementer exception.

17      PR    Privilege level
              0  The processor can execute both user- and supervisor-level instructions.
              1  The processor can only execute user-level instructions.

18      FP    Floating-point available
              0  The processor prevents dispatch of floating-point instructions,
                 including floating-point loads, stores, and moves.
              1  The processor can execute floating-point instructions.

19      ME    Machine check enable
              0  Machine check exceptions are disabled.
              1  Machine check exceptions are enabled.

20      FE0   Floating-point exception mode 0 (see Table 2-9).

21      SE    Single-step trace enable (Optional)
              0  The processor executes instructions normally.
              1  The processor generates a single-step trace exception upon the
                 successful execution of the next instruction.
              Note: If the function is not implemented, this bit is treated as reserved.

22      BE    Branch trace enable (Optional)
              0  The processor executes branch instructions normally.
              1  The processor generates a branch trace exception after completing the
                 execution of a branch instruction, regardless of whether the branch was
                 taken.
              Note: If the function is not implemented, this bit is treated as reserved.

23      FE1   Floating-point exception mode 1 (see Table 2-9).

24      —     Reserved

25      IP    Exception prefix. The setting of this bit specifies whether an exception
              vector offset is prepended with Fs or 0s. In the following description,
              nnnnn is the offset of the exception vector. See Table 6-2.
              0  Exceptions are vectored to the physical address 0x000n_nnnn.
              1  Exceptions are vectored to the physical address 0xFFFn_nnnn.
              In most systems, IP is set to 1 during system initialization, and then
              cleared to 0 when initialization is complete.

26      IR    Instruction address translation
              0  Instruction address translation is disabled.
              1  Instruction address translation is enabled.
              For more information, see Chapter 7, “Memory Management.”

27      DR    Data address translation
              0  Data address translation is disabled.
              1  Data address translation is enabled.
              For more information, see Chapter 7, “Memory Management.”

28-29   —     Reserved

30      RI    Recoverable exception (for system reset and machine check exceptions).
              0  Exception is not recoverable.
              1  Exception is recoverable.
              For more information, see Chapter 6, “Exceptions.”

31      LE    Little-endian mode enable
              0  The processor runs in big-endian mode.
              1  The processor runs in little-endian mode.

The floating-point exception mode bits (FE0-FE1) are interpreted as shown in
Table 2-9.


Table 2-9. Floating-Point Exception Mode Bits

FE0  FE1  Mode

0    0    Floating-point exceptions disabled
0    1    Floating-point imprecise nonrecoverable
1    0    Floating-point imprecise recoverable
1    1    Floating-point precise mode
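
Because the architecture numbers bits from the most-significant end (bit 0 is the leftmost bit), software written in C typically converts the bit numbers of Table 2-8 into masks. The following is a hedged sketch of that conversion, together with decoders for the FE0/FE1 mode of Table 2-9 and the IP exception prefix. The macro, enumerator, and function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* MSR bit n (numbered from the most-significant end) as a C mask. */
#define MSR_BIT(n) (UINT32_C(1) << (31 - (n)))

#define MSR_POW MSR_BIT(13)
#define MSR_ILE MSR_BIT(15)
#define MSR_EE  MSR_BIT(16)
#define MSR_PR  MSR_BIT(17)
#define MSR_FP  MSR_BIT(18)
#define MSR_ME  MSR_BIT(19)
#define MSR_FE0 MSR_BIT(20)
#define MSR_SE  MSR_BIT(21)
#define MSR_BE  MSR_BIT(22)
#define MSR_FE1 MSR_BIT(23)
#define MSR_IP  MSR_BIT(25)
#define MSR_IR  MSR_BIT(26)
#define MSR_DR  MSR_BIT(27)
#define MSR_RI  MSR_BIT(30)
#define MSR_LE  MSR_BIT(31)

/* Rows of Table 2-9, ordered so that the value equals (FE0 << 1) | FE1. */
typedef enum {
    FP_EXC_DISABLED             = 0,
    FP_EXC_IMPRECISE_NONRECOVER = 1,
    FP_EXC_IMPRECISE_RECOVER    = 2,
    FP_EXC_PRECISE              = 3
} fp_exc_mode;

fp_exc_mode msr_fp_exception_mode(uint32_t msr)
{
    int fe0 = (msr & MSR_FE0) != 0;
    int fe1 = (msr & MSR_FE1) != 0;
    return (fp_exc_mode)((fe0 << 1) | fe1);
}

/* Physical address of an exception vector: the 20-bit offset nnnnn is
   prefixed with 0xFFF when MSR[IP] = 1 and with 0x000 when MSR[IP] = 0. */
uint32_t exception_vector(uint32_t msr, uint32_t offset)
{
    assert(offset <= UINT32_C(0xFFFFF));
    return ((msr & MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0)) | offset;
}
```

For example, with this numbering MSR[EE] (bit 16) corresponds to the mask 0x00008000 and MSR[LE] (bit 31) to 0x00000001.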

Table 2-10 indicates the initial state of the MSR at power up.


Table 2-10. State of MSR at Power Up

Bit(s)  Name  32-Bit Default Value

0-12    —
13      POW   0
14      —
15      ILE   0
16      EE    0
17      PR    0
18      FP    0
19      ME    0
20      FE0   0
21      SE    0
22      BE    0
23      FE1   0
24      —     Unspecified (see note 1)
25      IP    1 (see note 2)
26      IR    0
27      DR    0
28-29   —     Unspecified (see note 1)
30      RI    0
31      LE    0

Notes:
1  Unspecified can be either 0 or 1.
2  1 is typical, but might be 0.


2.3.2 Processor Version Register (PVR) 

The processor version register (PVR) is a 32-bit, read-only register which contains a value 
identifying the specific version (model) and revision level of the PowerPC processor (see 
Figure 2-12). The contents of the PVR can be copied to a GPR by the mfspr instruction. 
Read access to the PVR is supervisor-level only; write access is not provided. 


Bits   0-15     16-31
Field  Version  Revision


Figure 2-12. Processor Version Register (PVR) 

The PVR consists of two 16-bit fields:

• Version (bits 0-15) — A 16-bit number that uniquely identifies a particular processor 
version. This number can be used to determine the version of a processor; it may not 
distinguish between different end product models if more than one model uses the 
same processor. 

• Revision (bits 16-31) — A 16-bit number that distinguishes between various releases
of a particular version (that is, an engineering change level). The value of the
revision portion of the PVR is implementation-specific. The processor revision level
is changed for each revision of the device.
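
Once supervisor-level software has copied the PVR into a GPR, the two fields can be separated with a shift and a mask. The C sketch below assumes the 32-bit value is already available in a variable; the pvr_fields type and pvr_decode name are illustrative.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t version;   /* PVR bits 0-15  (most-significant half)  */
    uint16_t revision;  /* PVR bits 16-31 (least-significant half) */
} pvr_fields;

pvr_fields pvr_decode(uint32_t pvr)
{
    pvr_fields f;

    f.version  = (uint16_t)(pvr >> 16);
    f.revision = (uint16_t)(pvr & UINT32_C(0xFFFF));

    /* The two fields together account for all 32 bits. */
    assert((((uint32_t)f.version << 16) | f.revision) == pvr);
    return f;
}
```

The meaning of a particular version or revision value is defined by the user's manual for the processor, not by this sketch.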

2.3.3 BAT Registers 

The BAT registers (BATs) maintain the address translation information for eight blocks of 
memory. The BATs are maintained by the system software and are implemented as eight 
pairs of special-purpose registers (SPRs). Each block is defined by a pair of SPRs called 
upper and lower BAT registers. These BAT registers define the starting addresses and sizes 
of BAT areas. 

The PowerPC OEA defines the BAT registers as eight instruction block-address translation
(IBAT) registers, consisting of four pairs of instruction BATs, or IBATs (IBAT0U-IBAT3U
and IBAT0L-IBAT3L), and eight data BATs, or DBATs (DBAT0U-DBAT3U and
DBAT0L-DBAT3L). See Figure 2-10 for a list of the SPR numbers for the BAT registers.


Figure 2-13 and Figure 2-14 show the format of the upper and lower BAT registers for 
32-bit PowerPC processors. 


Bits   0-14   15-18     19-29   30   31
Field  BEPI   Reserved  BL      Vs   Vp



Figure 2-13. Upper BAT Register 


Bits   0-14   15-24     25-28   29        30-31
Field  BRPN   Reserved  WIMG*   Reserved  PP

*W and G bits are not defined for IBAT registers. Attempting to write to these bits
causes boundedly-undefined results.


Figure 2-14. Lower BAT Register 

Table 2-11 describes the bits in the BAT registers.

Table 2-11. BAT Registers — Field and Bit Descriptions

Register   Bit(s)  Name  Description

Upper BAT  0-14    BEPI  Block effective page index. This field is compared with high-order
                         bits of the logical address to determine if there is a hit in that
                         BAT array entry.
                         Note: The architecture specification refers to logical address as
                         effective address.

           15-18   —     Reserved

           19-29   BL    Block length. BL is a mask that encodes the size of the block.
                         Values for this field are listed in Table 2-12.

           30      Vs    Supervisor mode valid bit. This bit interacts with MSR[PR] to
                         determine if there is a match with the logical address. For more
                         information, see Section 7.4.2, “Recognition of Addresses in BAT
                         Arrays.”

           31      Vp    User mode valid bit. This bit also interacts with MSR[PR] to
                         determine if there is a match with the logical address. For more
                         information, see Section 7.4.2, “Recognition of Addresses in BAT
                         Arrays.”

Lower BAT  0-14    BRPN  This field is used in conjunction with the BL field to generate
                         high-order bits of the physical address of the block.

           15-24   —     Reserved

           25-28   WIMG  Memory/cache access mode bits
                         W  Write-through
                         I  Caching-inhibited
                         M  Memory coherence
                         G  Guarded
                         Attempting to write to the W and G bits in IBAT registers causes
                         boundedly-undefined results. For detailed information about the
                         WIMG bits, see Section 5.2.1, “Memory/Cache Access Attributes.”

           29      —     Reserved

           30-31   PP    Protection bits for block. This field determines the protection
                         for the block as described in Section 7.4.4, “Block Memory
                         Protection.”


Table 2-12 lists the BAT area lengths encoded in BAT[BL].

Table 2-12. BAT Area Lengths

BAT Area Length  BL Encoding

128 Kbytes       000 0000 0000
256 Kbytes       000 0000 0001
512 Kbytes       000 0000 0011
1 Mbyte          000 0000 0111
2 Mbytes         000 0000 1111
4 Mbytes         000 0001 1111
8 Mbytes         000 0011 1111
16 Mbytes        000 0111 1111
32 Mbytes        000 1111 1111
64 Mbytes        001 1111 1111
128 Mbytes       011 1111 1111
256 Mbytes       111 1111 1111


Only the values shown in Table 2-12 are valid for the BL field. The rightmost bit of BL is 
aligned with bit 14 of the logical address. A logical address is determined to be within a 
BAT area if the logical address matches the value in the BEPI field. 

The boundary between the cleared bits and set bits (0s and 1s) in BL determines the bits of
logical address that participate in the comparison with BEPI. Bits in the logical address 
corresponding to set bits in BL are cleared for this comparison. Bits in the logical address 
corresponding to set bits in the BL field, concatenated with the 17 bits of the logical address 
to the right (less significant bits) of BL, form the offset within the BAT area. This is 
described in detail in Chapter 7, “Memory Management.” 

The value loaded into BL determines both the length of the BAT area and the alignment of 
the area in both logical and physical address space. The values loaded into BEPI and BRPN 
must have at least as many low-order zeros as there are ones in BL. 
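
The relationship between BL, BEPI, and BRPN can be illustrated with a short C sketch. It models only the address comparison and offset formation described above; the valid bits, the protection bits, and the WIMG bits are deliberately left out, and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BL encoding for a block of `bytes` bytes, or -1 if the size is not one
   of the power-of-two sizes from 128 Kbytes to 256 Mbytes (Table 2-12). */
int bat_bl_for_size(uint32_t bytes)
{
    if (bytes < (UINT32_C(1) << 17) || bytes > (UINT32_C(1) << 28))
        return -1;
    if ((bytes & (bytes - 1)) != 0)
        return -1;
    return (int)((bytes >> 17) - 1);
}

/* True if the logical address `ea` lies in the area described by BEPI
   and BL. Address bits under set BL bits are cleared before comparing. */
bool bat_hit(uint32_t ea, uint32_t bepi, uint32_t bl)
{
    assert(bepi <= 0x7FFF && bl <= 0x7FF);
    uint32_t ea_high = ea >> 17;            /* logical address bits 0-14 */
    return (ea_high & ~bl) == bepi;
}

/* Physical address for a hit: BRPN supplies the high-order bits, and the
   offset is formed from the address bits under BL plus the low 17 bits.
   Assumes BRPN has zeros wherever BL has ones, as required above. */
uint32_t bat_translate(uint32_t ea, uint32_t brpn, uint32_t bl)
{
    assert(brpn <= 0x7FFF && bl <= 0x7FF);
    uint32_t offset = ea & ((bl << 17) | UINT32_C(0x1FFFF));
    return (brpn << 17) | offset;
}
```

For a 256-Mbyte area at logical address 0x80000000, BEPI is 0x4000 and BL is 0x7FF, so every address from 0x80000000 through 0x8FFFFFFF hits, and the low 28 bits pass through as the offset.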

Use of BAT registers is described in Chapter 7, “Memory Management.” 

2.3.4 SDR1

The SDR1 is a 32-bit register and is shown in Figure 2-15. 


Bits   0-15     16-22     23-31
Field  HTABORG  Reserved  HTABMASK



Figure 2-15. SDR1 

The bits of SDR1 are described in Table 2-13. 

Table 2-13. SDR1 Bit Settings

Bits   Name      Description

0-15   HTABORG   The high-order 16 bits of the 32-bit physical address of the page table
16-22  —         Reserved
23-31  HTABMASK  Mask for page table address


The HTABORG field in SDR1 contains the high-order 16 bits of the 32-bit physical address
of the page table. Therefore, the page table is constrained to lie on a 2^16-byte (64-Kbyte)
boundary at a minimum. At least 10 bits from the hash function are used to index into the
page table. The page table must consist of at least 64 Kbytes (2^10 PTEGs of 64 bytes each).

The page table can be any size 2^n bytes, where 16 ≤ n ≤ 25. As the table size is increased,
more bits are used from the hash to index into the table and the value in HTABORG must have
more of its low-order bits equal to 0. The HTABMASK field in SDR1 contains a mask value
that determines how many bits from the hash are used in the page table index. This mask
must be of the form 0b00...011...1; that is, a string of 0 bits followed by a string of 1 bits.
The 1 bits determine how many additional bits (beyond the minimum of 10) from the hash
are used in the index; HTABORG must have this same number of low-order bits equal to 0.

See Figure 7-23 for an example of the primary PTEG address generation. 

For example, suppose that the page table is 8,192 (2^13) 64-byte PTEGs, for a total size of
2^19 bytes (512 Kbytes).

NOTE: A 13-bit index is required. Ten bits are provided from the hash initially, so 3
additional bits from the hash must be selected.

The value in HTABMASK must be 0x007 and the value in HTABORG must
have its low-order 3 bits (bits 13-15 of SDR1) equal to 0.

This means that the page table must begin on a
2^(3 + 10 + 6) = 2^19 = 512-Kbyte boundary.
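
The constraints in this section can be checked mechanically. The following C sketch builds an SDR1 value from a page table address and a PTEG count, rejecting combinations that violate the size or alignment rules; the function name and return convention are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Build an SDR1 value for a page table of `n_ptegs` 64-byte PTEGs located
   at physical address `table_phys`. Returns 0 and stores the value in
   *sdr1 on success, or -1 if the size or alignment rules are violated. */
int sdr1_encode(uint32_t table_phys, uint32_t n_ptegs, uint32_t *sdr1)
{
    /* Between 2^10 and 2^19 PTEGs (table size 2^16 to 2^25 bytes). */
    if (n_ptegs < (UINT32_C(1) << 10) || n_ptegs > (UINT32_C(1) << 19))
        return -1;
    if ((n_ptegs & (n_ptegs - 1)) != 0)
        return -1;

    /* The table must begin on a boundary equal to its own size. */
    uint32_t table_bytes = n_ptegs * 64;
    if (table_phys % table_bytes != 0)
        return -1;

    /* One HTABMASK bit for each hash bit beyond the first 10. */
    uint32_t htabmask = (n_ptegs >> 10) - 1;
    assert((htabmask & (htabmask + 1)) == 0);   /* form 0b0...01...1 */

    *sdr1 = (table_phys & UINT32_C(0xFFFF0000)) | htabmask;
    return 0;
}
```

With the 8,192-PTEG example above and a table placed at the (arbitrarily chosen) address 0x00380000, HTABMASK is 0x007 and the low-order 3 bits of HTABORG are 0, as required.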

For more information, refer to Chapter 7, “Memory Management.” 

2.3.5 Segment Registers

The segment registers contain the segment descriptors. The OEA defines a segment register 
file of sixteen 32-bit registers. Segment registers can be accessed by using the mtsr/mfsr 
and mtsrin/mfsrin instructions. The value of bit 0, the T bit, determines how the remaining 
register bits are interpreted. 

Figure 2-16 shows the format of a segment register when T = 0. 


Bits   0   1    2    3   4-7       8-31
Field  T   Ks   Kp   N   Reserved  VSID


Figure 2-16. Segment Register Format (T = 0) 

Segment register bit settings when T = 0 are described in Table 2-14. 

Table 2-14. Segment Register Bit Settings (T = 0)

Bits  Name  Description

0     T     T = 0 selects this format
1     Ks    Supervisor-state protection key
2     Kp    User-state protection key
3     N     No-execute protection
4-7   —     Reserved
8-31  VSID  Virtual segment ID


Figure 2-17 shows the bit definition when T = 1. 


Bits   0   1    2    3-11   12-31
Field  T   Ks   Kp   BUID   Controller-Specific Information


Figure 2-17. Segment Register Format (T = 1)

The bits in the segment register when T = 1 are described in Table 2-15.


Table 2-15. Segment Register Bit Settings (T = 1)

Bits   Name       Description

0      T          T = 1 selects this format.
1      Ks         Supervisor-state protection key
2      Kp         User-state protection key
3-11   BUID       Bus unit ID
12-31  CNTLRSPEC  Device-specific data for I/O controller


If an access is translated by the block address translation (BAT) mechanism, the BAT 
translation takes precedence and the results of translation using segment registers are not 
used. However, if an access is not translated by a BAT, and T = 0 in the selected segment 
register, the effective address is a reference to a memory-mapped segment. In this case, the 
52-bit virtual address (VA) is formed by concatenating the following: 

• The 24-bit VSID field from the segment register 

• The 16-bit page index, EA[4-19] 

• The 12-bit byte offset, EA[20-31]

The VA is then translated to a physical (real) address as described in Section 7.5, “Memory 
Segment Model.” 
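
The concatenation can be sketched in C as follows. The function names are illustrative. The sr_index helper assumes that the segment register is selected by the four high-order bits of the effective address, EA[0-3]; that selection rule comes from the architecture's memory management description rather than from this section.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the segment register used for `ea` (assumes selection by
   EA[0-3], the four most-significant bits). */
unsigned sr_index(uint32_t ea)
{
    return (unsigned)(ea >> 28);
}

/* Form the 52-bit virtual address from a T = 0 segment register value
   and an effective address: VSID || EA[4-19] || EA[20-31]. */
uint64_t va_from_segment(uint32_t sr, uint32_t ea)
{
    assert((sr >> 31) == 0);                       /* T must be 0 */

    uint64_t vsid        = sr & UINT32_C(0x00FFFFFF);   /* SR bits 8-31 */
    uint64_t page_index  = (ea >> 12) & 0xFFFF;          /* EA[4-19]     */
    uint64_t byte_offset = ea & 0xFFF;                   /* EA[20-31]    */

    return (vsid << 28) | (page_index << 12) | byte_offset;
}
```

The result occupies the low 52 bits of the returned 64-bit value.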

If T = 1 in the selected segment register (and the access is not translated by a BAT), the 
effective address is a reference to a direct-store segment. No reference is made to the page 
tables. 

NOTE: The direct-store facility is being phased out of the architecture and is
unlikely to be supported in future devices. Therefore, all new programs should
write a value of zero to the T bit.

For further discussion of address translation when T = 1, see Section 7.8,
“Direct-Store Segment Address Translation.”

2.3.6 Data Address Register (DAR) 

The DAR is a 32-bit register. The DAR is shown in Figure 2-18.

Figure 2-18. Data Address Register (DAR)

The effective address (EA) generated by a memory access instruction is placed in the DAR
if the access causes an exception (for example, an alignment exception). For more
information, see Chapter 6, “Exceptions.”

2.3.7 SPRG0-SPRG3 

SPRG0-SPRG3 are 32-bit registers. They are provided for general operating system use, 
such as performing a fast state save or for supporting multiprocessor implementations. 

The formats of SPRG0-SPRG3 are shown in Figure 2-19.

Bits   0-31
Field  SPRG0, SPRG1, SPRG2, or SPRG3 (one 32-bit field per register)

Figure 2-19. SPRG0-SPRG3

Table 2-16 provides a description of conventional uses of SPRG0 through SPRG3.

Table 2-16. Conventional Uses of SPRG0-SPRG3

Register  Description

SPRG0     Software may load a unique physical address in this register to identify an area
          of memory reserved for use by the first-level exception handler. This area must
          be unique for each processor in the system.

SPRG1     This register may be used as a scratch register by the first-level exception
          handler to save the content of a GPR. That GPR then can be loaded from SPRG0 and
          used as a base register to save other GPRs to memory.

SPRG2     This register may be used by the operating system as needed.

SPRG3     This register may be used by the operating system as needed.


2.3.8 DSISR 

The 32-bit DSISR, shown in Figure 2-20, identifies the cause of DSI and alignment 
exceptions. 


DSISR 

0 31 


Figure 2-20. DSISR 

For information about bit settings, see Section 6.4.3, “DSI Exception (0x00300),” and 
Section 6.4.6, “Alignment Exception (0x00600).” 

2.3.9 Machine Status Save/Restore Register 0 (SRR0)

The SRR0 is a 32-bit register. SRR0 is used to save the effective address on exceptions
(interrupts) and return to the interrupted program when an rfi instruction is executed. SRR0
holds the address of the first instruction that has not been executed in the program where
the exception occurs. It also holds the EA for the instruction that follows the System Call
(sc) instruction. The format of SRR0 is shown in Figure 2-21.

Figure 2-21. Machine Status Save/Restore Register 0 (SRR0)

When an exception occurs, SRR0 is set to point to an instruction such that all prior
instructions have completed execution and no subsequent instruction has completed
execution. In the case of an error exception, SRR0 points to the instruction
that caused the error. When an rfi instruction is executed, SRR0 contains the
address from which to fetch the next instruction to continue program execution. If the
offending instruction is to be emulated, the contents of SRR0 must be incremented by 4
to skip over that instruction. The exception type and status bits
are used to determine the action to be taken. In all cases the instruction pointed to by SRR0
has not completed execution.

NOTE: In some implementations, every instruction fetch performed while MSR[IR] = 1,
and every instruction execution requiring address translation when MSR[DR] = 1,
may modify SRR0.

For information on how specific exceptions affect SRR0, refer to the descriptions of
individual exceptions in Chapter 6, “Exceptions.”

2.3.10 Machine Status Save/Restore Register 1 (SRR1) 

The SRR1 is a 32-bit register used to save exception status and the machine state
register (MSR) on exceptions and to restore the MSR when an rfi
instruction is executed. The format of SRR1 is shown in Figure 2-22.

Figure 2-22. Machine Status Save/Restore Register 1 (SRR1)

When an exception occurs, bits 1-4 and 10-15 of SRR1 are loaded with exception-specific
information and bits 16-23, 25-27, and 30-31 of the MSR are placed into the
corresponding bit positions of SRR1. When the rfi is executed, MSR[16-23, 25-27, 30-31]
are loaded from SRR1[16-23, 25-27, 30-31].

The remaining bits of SRR1 are defined as reserved. An implementation may define one or 
more of these bits, and in this case, may also cause them to be saved from MSR on an 
exception and restored to MSR from SRR1 on an rfi. 
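
The bit ranges above translate into fixed masks. The C sketch below derives them from the architecture's bit numbering and models the save and restore steps; it is an illustration only, the function names are assumptions, and MSR bits outside the listed ranges are simply left unchanged by the modeled rfi.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits first..last, numbered from the most-significant end. */
uint32_t bit_range(int first, int last)
{
    assert(0 <= first && first <= last && last <= 31);
    int width = last - first + 1;
    uint32_t ones = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return ones << (31 - last);
}

/* MSR bits 16-23, 25-27, and 30-31, which are saved to and restored
   from SRR1. */
uint32_t srr1_msr_mask(void)
{
    return bit_range(16, 23) | bit_range(25, 27) | bit_range(30, 31);
}

/* SRR1 contents after an exception: exception-specific information in
   bits 1-4 and 10-15, MSR bits in the saved positions. */
uint32_t srr1_on_exception(uint32_t msr, uint32_t exc_info)
{
    uint32_t info_mask = bit_range(1, 4) | bit_range(10, 15);
    return (exc_info & info_mask) | (msr & srr1_msr_mask());
}

/* MSR contents after rfi: the saved positions are reloaded from SRR1. */
uint32_t msr_after_rfi(uint32_t msr, uint32_t srr1)
{
    uint32_t m = srr1_msr_mask();
    return (msr & ~m) | (srr1 & m);
}
```

The combined mask evaluates to 0x0000FF73.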

NOTE: In some implementations, every instruction fetch when MSR[IR] = 1, and every 
instruction execution requiring address translation when MSR[DR] = 1, may 
modify SRR1. 

For information on how specific exceptions affect SRR1, refer to the individual exceptions 
in Chapter 6, “Exceptions.” 

2.3.11 Floating-Point Exception Cause Register (FPECR) 

The FPECR register may be used to identify the cause of a floating-point exception. 

NOTE: The FPECR is an optional register in the PowerPC architecture and may be
implemented differently (or not at all) in the design of each processor. The user’s
manual of a specific processor will describe the functionality of the FPECR, if it
is implemented in that processor.

2.3.12 Time Base Facility (TB) — OEA 

As described in Section 2.2, “PowerPC VEA Register Set — Time Base,” the time base (TB) 
provides a long-period counter driven by an implementation-dependent frequency. The 
VEA defines user-level read-only access to the TB. Writing to the TB is reserved for 
supervisor-level applications such as operating systems and boot-strap routines. The OEA 
defines supervisor-level, write access to the TB. 

The TB is a volatile resource and must be initialized during reset. Some implementations 
may initialize the TB with a known value; however, there is no guarantee of automatic 
initialization of the TB when the processor is reset. The TB runs continuously after start-up. 

For more information on the user-level aspects of the time base, refer to Section 2.2, 
“PowerPC VEA Register Set — Time Base.” 

2.3.12.1 Writing to the Time Base 

NOTE: Writing to the TB is reserved for supervisor-level software. 

The simplified mnemonics, mttbl and mttbu, write the lower and upper halves of the TB, 
respectively. The simplified mnemonics listed above are for the mtspr instruction; see 
Appendix F, “Simplified Mnemonics,” for more information. The mtspr, mttbl, and mttbu 
instructions treat TBL and TBU as separate 32-bit registers; setting one leaves the other 
unchanged. It is not possible to write the entire 64-bit time base in a single instruction. 

The TB can be written by a sequence such as:

    lwz    rx,upper    #load 64-bit value for
    lwz    ry,lower    # TB into rx and ry
    li     rz,0
    mttbl  rz          #force TBL to 0
    mttbu  rx          #set TBU
    mttbl  ry          #set TBL



Provided that no exceptions occur while the last three instructions are being executed, 
loading 0 into TBL prevents the possibility of a carry from TBL to TBU while the time base 
is being initialized. 
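
The effect of clearing TBL first can be illustrated with a small software model. The sketch below is not part of the architecture definition; the type `time_base` and the functions `tb_tick`, `tb_write_safe`, and `tb_write_naive` are hypothetical names, and the model simply assumes that a fixed number of ticks elapses between consecutive writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 64-bit time base as two 32-bit halves. */
typedef struct {
    uint32_t tbu;   /* upper half (TBU) */
    uint32_t tbl;   /* lower half (TBL) */
} time_base;

/* Advance the counter by n ticks, carrying from TBL into TBU on overflow. */
void tb_tick(time_base *tb, uint32_t n)
{
    uint32_t old = tb->tbl;
    tb->tbl += n;
    if (tb->tbl < old)          /* unsigned wrap-around: a carry occurred */
        tb->tbu += 1;
}

/* Models mttbl 0; mttbu upper; mttbl lower, with `gap` ticks elapsing
 * between consecutive writes. Returns the TB value after the last write. */
uint64_t tb_write_safe(time_base *tb, uint32_t upper, uint32_t lower,
                       uint32_t gap)
{
    assert(tb != NULL);
    tb->tbl = 0;                /* force TBL to 0 */
    tb_tick(tb, gap);
    tb->tbu = upper;            /* set TBU */
    tb_tick(tb, gap);           /* TBL is small, so no carry can occur */
    tb->tbl = lower;            /* set TBL */
    return ((uint64_t)tb->tbu << 32) | tb->tbl;
}

/* Models mttbu upper; mttbl lower without clearing TBL first. */
uint64_t tb_write_naive(time_base *tb, uint32_t upper, uint32_t lower,
                        uint32_t gap)
{
    assert(tb != NULL);
    tb->tbu = upper;
    tb_tick(tb, gap);           /* a carry here corrupts the new TBU */
    tb->tbl = lower;
    return ((uint64_t)tb->tbu << 32) | tb->tbl;
}
```

In this model, starting from TBL = 0xFFFF_FFFF, the naive order lets a carry increment the freshly written TBU, whereas the three-write sequence leaves TBU equal to the intended value.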

For information on reading the time base, refer to Section 2.2.1, “Reading the Time Base.” 

2.3.13 Decrementer Register (DEC) 

The decrementer register (DEC), shown in Figure 2-23, is a 32-bit decrementing counter 
that provides a mechanism for causing a decrementer exception after a programmable 
delay. The DEC frequency is based on the same implementation-dependent frequency that 
drives the time base. 


                      DEC 
0                                            31 


Figure 2-23. Decrementer Register (DEC) 

2.3.13.1 Decrementer Operation 

The DEC counts down, causing an exception (unless masked by MSR[EE]) when it passes 
through zero. The DEC satisfies the following requirements: 

• The operation of the time base and the DEC are coherent (that is, the counters are 
driven by the same fundamental time base). 

• Loading a GPR from the DEC has no effect on the DEC. 

• Storing the contents of a GPR to the DEC replaces the value in the DEC with the 
value in the GPR. 

• Whenever bit 0 of the DEC changes from 0 to 1, a decrementer exception request is 
signaled. Multiple DEC exception requests may be received before the first 
exception occurs; however, any additional requests are canceled when the exception 
occurs for the first request. 

• If the DEC is altered by software and the content of bit 0 is changed from 0 to 1, an 
exception request is signaled. 

2.3.13.2 Writing and Reading the DEC 

The content of the DEC can be read or written using the mfspr and mtspr instructions, both 
of which are supervisor-level when they refer to the DEC. Using a simplified mnemonic for 
the mtspr instruction, the DEC may be written from GPR rA with the following: 

mtdec rA 

Using a simplified mnemonic for the mfspr instruction, the DEC may be read into GPR rA 
with the following: 

mfdec rA 
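
The requirements listed in Section 2.3.13.1, together with the read and write accesses above, can be summarized in a small software model. This is only a sketch under stated assumptions: `dec_model` and its functions are hypothetical names, one call to `dec_tick` stands for one decrement, and exception delivery is reduced to a single pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t value;     /* current DEC contents */
    bool     pending;   /* a decrementer exception request is outstanding */
} dec_model;

/* Bit 0 is the most-significant bit in PowerPC bit numbering. */
bool dec_bit0(uint32_t v)
{
    return (v >> 31) != 0;
}

/* Models mtdec: replaces the value; a 0-to-1 change of bit 0 signals a request. */
void dec_write(dec_model *d, uint32_t v)
{
    assert(d != NULL);
    if (!dec_bit0(d->value) && dec_bit0(v))
        d->pending = true;
    d->value = v;
}

/* Models mfdec: reading has no effect on the DEC. */
uint32_t dec_read(const dec_model *d)
{
    assert(d != NULL);
    return d->value;
}

/* One decrement; passing through zero changes bit 0 from 0 to 1. */
void dec_tick(dec_model *d)
{
    assert(d != NULL);
    uint32_t next = d->value - 1u;
    if (!dec_bit0(d->value) && dec_bit0(next))
        d->pending = true;
    d->value = next;
}

/* The exception is taken only when enabled by MSR[EE]; taking it cancels
 * any additional requests that accumulated in the meantime. */
bool dec_take_exception(dec_model *d, bool msr_ee)
{
    assert(d != NULL);
    if (d->pending && msr_ee) {
        d->pending = false;
        return true;
    }
    return false;
}
```

Note that the model signals a request both when counting down from 0 to 0xFFFF_FFFF and when software writes a value with bit 0 set over a value with bit 0 clear.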

2.3.14 Data Address Breakpoint Register (DABR) 

The optional data address breakpoint facility is controlled by an optional SPR, the DABR. 
The DABR is a 32-bit register. The data address breakpoint facility is optional to the 
PowerPC architecture. However, if the data address breakpoint facility is implemented, it 
is recommended, but not required, that it be implemented as described in this section. 

The data address breakpoint facility provides a means to detect accesses to a designated 
double word. The address comparison is done on an effective address, and it applies to data 
accesses only. It does not apply to instruction fetches. 

The DABR is shown in Figure 2-24. 


              DAB                | BT | DW | DR 
0                             28 | 29 | 30 | 31 

Figure 2-24. Data Address Breakpoint Register (DABR) 

Table 2-17 describes the fields in the DABR. 

Table 2-17. DABR — Bit Settings 


Bit(s)   Name   Description 
0-28     DAB    Data address breakpoint 
29       BT     Breakpoint translation enable 
30       DW     Data write enable 
31       DR     Data read enable 

A data address breakpoint match is detected for a load or store instruction if the three 
following conditions are met for any byte accessed: 

• EA[0-28] = DABR[DAB] 

• MSR[DR] = DABR[BT] 

• The instruction is a store and DABR[DW] = 1, or the instruction is a load and 
DABR[DR] = 1. 
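
The three match conditions can be expressed as a predicate over the DABR fields of Table 2-17. The following C sketch is illustrative only; the macro and function names are hypothetical, and the masks follow from PowerPC bit numbering, in which bit 0 is the most-significant bit of the 32-bit register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DABR_DAB_MASK 0xFFFFFFF8u   /* bits 0-28: double-word address */
#define DABR_BT       0x00000004u   /* bit 29: breakpoint translation enable */
#define DABR_DW       0x00000002u   /* bit 30: data write enable */
#define DABR_DR       0x00000001u   /* bit 31: data read enable */

/* Evaluates the three match conditions for a single byte at address ea. */
bool dabr_match_byte(uint32_t dabr, uint32_t ea, bool msr_dr, bool is_store)
{
    bool addr_ok  = (ea & DABR_DAB_MASK) == (dabr & DABR_DAB_MASK);
    bool xlate_ok = msr_dr == ((dabr & DABR_BT) != 0);
    bool type_ok  = is_store ? (dabr & DABR_DW) != 0
                             : (dabr & DABR_DR) != 0;
    return addr_ok && xlate_ok && type_ok;
}

/* A match is detected if the conditions hold for any byte accessed. */
bool dabr_match(uint32_t dabr, uint32_t ea, uint32_t len, bool msr_dr,
                bool is_store)
{
    assert(len >= 1);
    for (uint32_t i = 0; i < len; i++) {
        if (dabr_match_byte(dabr, ea + i, msr_dr, is_store))
            return true;
    }
    return false;
}
```

Because the comparison is made per byte, a misaligned access that only partly overlaps the designated double word still matches in this model. The undefined cases listed next are not modeled.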

Even if the above conditions are satisfied, it is undefined whether a match occurs in the 
following cases: 

• A store conditional instruction (stwcx.) in which the store is not performed 

• A load or store string instruction (lswx or stswx) with a zero length 

• A dcbz, dcba, eciwx, or ecowx instruction. For the purpose of determining whether 
a match occurs, eciwx is treated as a load, and dcbz, dcba, and ecowx are treated as 
stores. 

The cache management instructions other than dcbz and dcba never cause a match. If dcbz 
or dcba causes a match, some or all of the target memory locations may have been updated. 

A match generates a DSI exception. Refer to Section 6.4.3, “DSI Exception (0x00300),” for 
more information on the data address breakpoint facility. 

2.3.15 External Access Register (EAR) 

The EAR is an optional 32-bit SPR that controls access to the external control facility and 
identifies the target device for external control operations. The external control facility 
provides a means for user-level instructions to communicate with special external devices. 
The EAR is shown in Figure 2-25. 


E | Reserved (000 0000 0000 0000 0000 0000 00) | RID 
0 | 1                                       25 | 26   31 


Figure 2-25. External Access Register (EAR) 

The high-order bits of the resource ID (RID) field beyond the width of the RID supported 
by a particular implementation are treated as reserved bits. 

The EAR register is provided to support the External Control In Word Indexed (eciwx) and 
External Control Out Word Indexed (ecowx) instructions, which are described in 
Chapter 8, “Instruction Set.” Although access to the EAR is supervisor-level, the operating 
system can determine which tasks are allowed to issue external access instructions and 
when they are allowed to do so. The bit settings for the EAR are described in Table 2-18. 
Interpretation of the physical address transmitted by the eciwx and ecowx instructions and 
the 32-bit value transmitted by the ecowx instruction is not prescribed by the PowerPC 
OEA but is determined by the target device. The data access of eciwx and ecowx is 
performed as though the memory access mode bits (WIMG) were 0101. 

For example, if the external control facility is used to support a graphics adapter, the ecowx 
instruction could be used to send the translated physical address of a buffer containing 
graphics data to the graphics device. The eciwx instruction could be used to load status 
information from the graphics adapter. 


Table 2-18. External Access Register (EAR) Bit Settings 


Bit     Name   Description 
0       E      Enable bit 
               1 Enabled 
               0 Disabled 
               If this bit is set, the eciwx and ecowx instructions can perform the 
               specified external operation. If the bit is cleared, an eciwx or ecowx 
               instruction causes a DSI exception. 
1-25    --     Reserved 
26-31   RID    Resource ID 


This register can also be accessed by using the mtspr and mfspr instructions. 
Synchronization requirements for the EAR are shown in Table 2-19 and Table 2-20. 
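
As an illustration of the field layout in Table 2-18, the sketch below packs and unpacks an EAR image in C. The macro and function names are hypothetical, the masks are derived from the bit numbers in the table (bit 0 is the most-significant bit), and the full six-bit RID width is assumed even though an implementation may support fewer bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EAR_E        0x80000000u   /* bit 0: enable */
#define EAR_RID_MASK 0x0000003Fu   /* bits 26-31: resource ID */

/* Builds an EAR image; reserved bits 1-25 are left as zero. */
uint32_t ear_make(bool enable, uint32_t rid)
{
    assert(rid <= EAR_RID_MASK);
    return (enable ? EAR_E : 0u) | rid;
}

/* True if eciwx and ecowx may perform the external operation. */
bool ear_enabled(uint32_t ear)
{
    return (ear & EAR_E) != 0;
}

/* Extracts the resource ID of the target device. */
uint32_t ear_rid(uint32_t ear)
{
    return ear & EAR_RID_MASK;
}
```

A value built this way would be written with mtspr by supervisor-level software; the model itself performs no register access.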

2.3.16 Processor Identification Register (PIR) 

The PIR register is used to differentiate between individual processors in a multiprocessor 
environment. 

NOTE: The PIR is an optional register in the PowerPC architecture and may be 
implemented differently (or not at all) in the design of each processor. The user’s 
manual of a specific processor will describe the functionality of the PIR, if it is 
implemented for that processor. 

2.3.17 Synchronization Requirements for Special Registers and for 
Lookaside Buffers 

Changing the value in certain system registers, and invalidating TLB entries, can cause 
alteration of the context in which data addresses and instruction addresses are interpreted, 
and in which instructions are executed. An instruction that alters the context in which data 
addresses or instruction addresses are interpreted, or in which instructions are executed, is 
called a context-altering instruction. The context synchronization required for 
context-altering instructions is shown in Table 2-19 for data access and Table 2-20 for 
instruction fetch and execution. 

A context-synchronizing exception (that is, any exception except nonrecoverable system 
reset or nonrecoverable machine check) can be used instead of a context-synchronizing 
instruction. In the tables, if no software synchronization is required before (after) a 
context-altering instruction, the synchronizing instruction before (after) the 
context-altering instruction should be interpreted as meaning the context-altering 
instruction itself. 

A synchronizing instruction before the context-altering instruction ensures that all 
instructions up to and including that synchronizing instruction are fetched and executed in 
the context that existed before the alteration. A synchronizing instruction after the 
context-altering instruction ensures that all instructions after that synchronizing 
instruction are fetched and executed in the context established by the alteration. 
Instructions after the first synchronizing instruction, up to and including the second 
synchronizing instruction, may be fetched or executed in either context. 

If a sequence of instructions contains context-altering instructions and contains no 
instructions that are affected by any of the context alterations, no software synchronization 
is required within the sequence. 

NOTE: Some instructions that occur naturally in the program, such as the rfi at the end 
of an exception handler, provide the required synchronization. 

No software synchronization is required before altering the MSR (except when altering the 
MSR[POW] or MSR[LE] bits; see Table 2-19 and Table 2-20), because mtmsr is execution 
synchronizing. No software synchronization is required before most of the other alterations 
shown in Table 2-20, because all instructions before the context-altering instruction are 
fetched and decoded before the context-altering instruction is executed (the processor must 
determine whether any of the preceding instructions are context synchronizing). 

Table 2-19 provides information on data access synchronization requirements. 


Table 2-19. Data Access Synchronization 


Instruction/Event   Required Prior                      Required After                              Notes 
Exception           None                                None                                        1 
rfi                 None                                None                                        1 
sc                  None                                None                                        1 
Trap                None                                None                                        1 
mtmsr (ILE)         None                                None 
mtmsr (PR)          None                                Context-synchronizing instruction 
mtmsr (ME)          None                                Context-synchronizing instruction           2 
mtmsr (DR)          None                                Context-synchronizing instruction 
mtmsr (LE)          --                                  --                                          3 
mtsr [or mtsrin]    Context-synchronizing instruction   Context-synchronizing instruction 
mtspr (SDR1)        sync                                Context-synchronizing instruction           4, 5 
mtspr (DBAT)        Context-synchronizing instruction   Context-synchronizing instruction 
mtspr (DABR)        --                                  --                                          6 
mtspr (EAR)         Context-synchronizing instruction   Context-synchronizing instruction 
tlbie               Context-synchronizing instruction   Context-synchronizing instruction or sync   7 
tlbia               Context-synchronizing instruction   Context-synchronizing instruction or sync   7 

Notes: 

1 Synchronization requirements for changing the power conserving mode are implementation-dependent. 

2 A context synchronizing instruction is required after modification of the MSR[ME] bit to ensure that the 
modification takes effect for subsequent machine check exceptions, which may not be recoverable and 
therefore may not be context synchronizing. 

3 Synchronization requirements for changing from one endian mode to the other are implementation-dependent. 

4 SDR1 must not be altered when MSR[DR] = 1 or MSR[IR] = 1; if it is, the results are undefined. 

5 A sync instruction is required before the mtspr instruction because SDR1 identifies the page table and thereby 
the location of the referenced and changed (R and C) bits. To ensure that R and C bits are updated in the 
correct page table, SDR1 must not be altered until all R and C bit updates due to instructions before the mtspr 
have completed. A sync instruction guarantees this synchronization of R and C bit updates, while neither a 
context synchronizing operation nor the instruction fetching mechanism does so. 

6 Synchronization requirements for changing the DABR are implementation-dependent. 

7 Multiprocessor systems have other requirements to synchronize TLB invalidate. 

Table 2-20 provides information on instruction access synchronization requirements. 


Table 2-20. Instruction Access Synchronization 


Instruction/Event   Required Prior   Required After                              Notes 
Exception           None             None                                        1 
rfi                 None             None                                        1 
sc                  None             None                                        1 
Trap                None             None                                        1 
mtmsr (POW)         --               --                                          1 
mtmsr (ILE)         None             None 
mtmsr (EE)          None             None                                        2 
mtmsr (PR)          None             Context-synchronizing instruction 
mtmsr (FP)          None             Context-synchronizing instruction 
mtmsr (ME)          None             Context-synchronizing instruction           3 
mtmsr (FE0, FE1)    None             Context-synchronizing instruction 
mtmsr (SE, BE)      None             Context-synchronizing instruction 
mtmsr (IP)          None             None 
mtmsr (IR)          None             Context-synchronizing instruction           4 
mtmsr (RI)          None             None 
mtmsr (LE)          --               --                                          5 
mtsr [or mtsrin]    None             Context-synchronizing instruction           4 
mtspr (SDR1)        sync             Context-synchronizing instruction           6, 7 
mtspr (IBAT)        None             Context-synchronizing instruction           4 
mtspr (DEC)         None             None                                        8 
tlbie               None             Context-synchronizing instruction or sync   9 
tlbia               None             Context-synchronizing instruction or sync   9 


Notes: 

1 Synchronization requirements for changing the power conserving mode are implementation-dependent. 

2 The effect of altering the EE bit is immediate as follows: 

• If an mtmsr sets the EE bit to 0, neither an external interrupt nor a decrementer exception can occur after 
the instruction is executed. 

• If an mtmsr sets the EE bit to 1 when an external interrupt, decrementer exception, or higher priority 
exception exists, the corresponding exception occurs immediately after the mtmsr is executed, and 
before the next instruction is executed in the program that set MSR[EE]. 

3 A context synchronizing instruction is required after modification of the MSR[ME] bit to ensure that the 
modification takes effect for subsequent machine check exceptions, which may not be recoverable and therefore 
may not be context synchronizing. 

4 The alteration must not cause an implicit branch in physical address space. The physical address of the context- 
altering instruction and of each subsequent instruction, up to and including the next context synchronizing 
instruction, must be independent of whether the alteration has taken effect. 

5 Synchronization requirements for changing from one endian mode to the other are implementation-dependent. 

6 SDR1 must not be altered when MSR[DR] = 1 or MSR[IR] = 1; if it is, the results are undefined. 

7 A sync instruction is required before the mtspr instruction because SDR1 identifies the page table and thereby 
the location of the referenced and changed (R and C) bits. To ensure that R and C bits are updated in the correct 
page table, SDR1 must not be altered until all R and C bit updates due to instructions before the mtspr have 
completed. A sync instruction guarantees this synchronization of R and C bit updates, while neither a context 
synchronizing operation nor the instruction fetching mechanism does so. 

8 The elapsed time between the content of the decrementer becoming negative and the signaling of the 
decrementer exception is not defined. 

9 Multiprocessor systems have other requirements to synchronize TLB invalidate. 

Chapter 3. Operand Conventions 


This chapter describes the operand conventions as they are represented in two levels of the 
PowerPC architecture — user instruction set architecture (UISA) and virtual environment 
architecture (VEA). Detailed descriptions are provided of conventions used for storing 
values in registers and memory, accessing PowerPC registers, and representing data in these 
registers in both big- and little-endian modes. Additionally, the floating-point data formats 
and exception conditions are described. Refer to Appendix D, “Floating-Point Models,” for 
more information on the implementation of the IEEE floating-point execution models. 


3.1 Data Organization in Memory and Data Transfers 

In a PowerPC microprocessor-based system, bytes in memory are numbered consecutively 
starting with 0. Each number is the address of the corresponding byte. Memory operands 
may be bytes, half-words, words, or double words, or, for the load and store multiple and 
the load and store string instructions, a sequence of bytes or words. The address of a 
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. 



The following sections describe the concepts of alignment and byte ordering of data, and 
their significance to the PowerPC architecture. 


3.1.1 Aligned and Misaligned Accesses 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the natural address of an operand is 
an integral multiple of the operand length. A memory operand is said to be aligned if it is 
aligned at its natural boundary; otherwise it is misaligned. Instructions are always four 
bytes long and word-aligned. 

Operands for single-register memory access instructions have the characteristics shown in 
Table 3-1. (Although not permitted as memory operands, quad words are shown because 
quad-word alignment is desirable for certain memory operands.) 


Table 3-1. Memory Operand Alignment 


Operand       Length     Aligned Addr(28-31) (see Note 1) 
Byte          8 bits     xxxx 
Half word     2 bytes    xxx0 
Word          4 bytes    xx00 
Double word   8 bytes    x000 
Quad word     16 bytes   0000 


Note: 1 An x in an address bit position indicates that the bit can be 0 or 1 
independent of the state of other bits in the address. 


The concept of alignment is also applied more generally to data in memory. For example, 
a 12-byte data item is said to be word-aligned if its address is a multiple of four. 

Some instructions require their memory operands to have certain alignment. In addition, 
alignment may affect performance. For single-register memory access instructions, the best 
performance is obtained when memory operands are aligned. 
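
The alignment rule of Table 3-1 reduces to a test on the low-order address bits. The following C sketch shows one way to express it; the function name `is_aligned` is hypothetical, and the operand length is assumed to be a power of two, as it is for every row of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr is an integral multiple of len.
 * len must be a power of two (1, 2, 4, 8, or 16 bytes in Table 3-1). */
bool is_aligned(uint32_t addr, uint32_t len)
{
    assert(len != 0 && (len & (len - 1)) == 0);
    return (addr & (len - 1)) == 0;   /* low-order bits must all be 0 */
}
```

For example, the 12-byte data item mentioned above is word-aligned whenever `is_aligned(addr, 4)` holds for its starting address.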

3.1.2 Byte Ordering 

If individual data items were indivisible, the concept of byte ordering would be 
unnecessary. The order of bits or groups of bits within the smallest addressable unit of 
memory is irrelevant, because nothing can be observed about such order. Order matters 
only when scalars, which the processor and programmer regard as indivisible quantities, 
can be made up of more than one addressable unit of memory. 

For PowerPC processors, the smallest addressable memory unit is the byte (8 bits), and 
scalars are composed of one or more sequential bytes. When a 32-bit scalar is moved from 
a register to memory, it occupies four consecutive bytes in memory, and a decision must be 
made regarding the order of these bytes in these four addresses. 

Although the choice of byte ordering is arbitrary, only two orderings are practical — big- 
endian and little-endian. The PowerPC architecture supports both big- and little-endian 
byte ordering. The default byte ordering is big-endian. 

3.1.2.1 Big-Endian Byte Ordering 

For big-endian scalars, the most-significant byte (MSB) is stored at the lowest (or starting) 
address while the least-significant byte (LSB) is stored at the highest (or ending) address. 
This is called big-endian because the big end of the scalar comes first in memory. 

3.1.2.2 Little-Endian Byte Ordering 

For little-endian scalars, the least-significant byte is stored at the lowest (or starting) 
address while the most-significant byte is stored at the highest (or ending) address. This is 
called little-endian because the little end of the scalar comes first in memory. 
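
The two orderings can be compared by writing the same 32-bit scalar into a byte array in each order. This is a portable C sketch that does not depend on the byte order of the machine it runs on; the function names `store_be32` and `store_le32` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian: most-significant byte at the lowest address. */
void store_be32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Little-endian: least-significant byte at the lowest address. */
void store_le32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}
```

Storing 0x1112_1314 with `store_be32` yields the byte sequence 11 12 13 14 at increasing addresses, while `store_le32` yields 14 13 12 11.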
3.1.3 Structure Mapping Examples 

Figure 3-1 shows a C programming example that contains an assortment of scalars and one 
array of characters (a string). The value presumed to be in each structure element is shown 
in hexadecimal in the comments (except for the character array, which is represented by a 
sequence of characters, each enclosed in single quote marks). 

struct { 
    int     a;      /* 0x1112_1314                          word           */ 
    double  b;      /* 0x2122_2324_2526_2728                double word    */ 
    char *  c;      /* 0x3132_3334                          word           */ 
    char    d[7];   /* 'L', 'M', 'N', 'O', 'P', 'Q', 'R'    array of bytes */ 
    short   e;      /* 0x5152                               half word      */ 
    int     f;      /* 0x6162_6364                          word           */ 
} S; 


Figure 3-1. C Program Example — Data Structure S 

The data structure S is used throughout this section to demonstrate how the bytes that 
make up each element (a, b, c, d, e, and f) are mapped into memory. 

3.1.3.1 Big-Endian Mapping 

The big-endian mapping of the structure, S, is shown in Figure 3-2. Addresses are shown 
in hexadecimal below each byte. The content of each byte, as shown in the preceding C 
programming example, is shown in hexadecimal and, for the character array, as characters 
enclosed in single quote marks. 

NOTE: The most-significant byte of each scalar is at the lowest address. 


Contents   11    12    13    14    (x)   (x)   (x)   (x) 
Address    00    01    02    03    04    05    06    07 

Contents   21    22    23    24    25    26    27    28 
Address    08    09    0A    0B    0C    0D    0E    0F 

Contents   31    32    33    34    'L'   'M'   'N'   'O' 
Address    10    11    12    13    14    15    16    17 

Contents   'P'   'Q'   'R'   (x)   51    52    (x)   (x) 
Address    18    19    1A    1B    1C    1D    1E    1F 

Contents   61    62    63    64    (x)   (x)   (x)   (x) 
Address    20    21    22    23    24    25    26    27 


Figure 3-2. Big-Endian Mapping of Structure S 

The structure mapping introduces padding (skipped bytes indicated by (x) in Figure 3-2) in 
the map in order to align the scalars on their proper boundaries: four bytes between 
elements a and b, one byte between elements d and e, and two bytes between elements e 
and f. 

NOTE: The padding is dependent on the compiler; it is not a function of the architecture. 
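
The offsets in Figure 3-2 follow from placing each element at the next address that satisfies its natural alignment. The C sketch below reproduces that calculation. It is a simplified model rather than a description of any particular compiler; the names `member`, `round_up`, and `layout` are hypothetical, and sizes and alignments are supplied explicitly (with a 4-byte pointer for element c) instead of being taken from the host.

```c
#include <assert.h>
#include <stddef.h>

/* Size and required alignment of one structure element, in bytes. */
typedef struct {
    size_t size;
    size_t align;   /* must be a power of two */
} member;

/* Rounds n up to the next multiple of align. */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Computes the offset of each element and returns the total size,
 * including tail padding up to the strictest element alignment. */
size_t layout(const member *m, size_t count, size_t *offsets)
{
    size_t off = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < count; i++) {
        off = round_up(off, m[i].align);   /* insert padding if needed */
        offsets[i] = off;
        off += m[i].size;
        if (m[i].align > max_align)
            max_align = m[i].align;
    }
    return round_up(off, max_align);
}
```

With elements of size and alignment {4,4}, {8,8}, {4,4}, {7,1}, {2,2}, and {4,4} for a through f, the model places them at offsets 0x00, 0x08, 0x10, 0x14, 0x1C, and 0x20 and reports a total size of 0x28 bytes, which agrees with the figure.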
3.1.3.2 Little-Endian Mapping 

Figure 3-3 shows the structure, S, using little-endian mapping. 

NOTE: The least-significant byte of each scalar is at the lowest address. 


Contents   14    13    12    11    (x)   (x)   (x)   (x) 
Address    00    01    02    03    04    05    06    07 

Contents   28    27    26    25    24    23    22    21 
Address    08    09    0A    0B    0C    0D    0E    0F 

Contents   34    33    32    31    'L'   'M'   'N'   'O' 
Address    10    11    12    13    14    15    16    17 

Contents   'P'   'Q'   'R'   (x)   52    51    (x)   (x) 
Address    18    19    1A    1B    1C    1D    1E    1F 

Contents   64    63    62    61    (x)   (x)   (x)   (x) 
Address    20    21    22    23    24    25    26    27 


Figure 3-3. Little-Endian Mapping of Structure S 

Figure 3-3 shows the sequence of double words laid out with addresses increasing from left 
to right. Programmers familiar with little-endian byte ordering may be more accustomed to 
viewing double words laid out with addresses increasing from right to left, as shown in 
Figure 3-4. This allows the little-endian programmer to view each scalar in its natural byte 
order of MSB to LSB. However, to demonstrate how the PowerPC architecture provides 
both big- and little-endian support, this section uses the convention of showing addresses 
increasing from left to right, as in Figure 3-3. 

Contents   (x)   (x)   (x)   (x)   11    12    13    14 
Address    07    06    05    04    03    02    01    00 

Contents   21    22    23    24    25    26    27    28 
Address    0F    0E    0D    0C    0B    0A    09    08 

Contents   'O'   'N'   'M'   'L'   31    32    33    34 
Address    17    16    15    14    13    12    11    10 

Contents   (x)   (x)   51    52    (x)   'R'   'Q'   'P' 
Address    1F    1E    1D    1C    1B    1A    19    18 

Contents   (x)   (x)   (x)   (x)   61    62    63    64 
Address    27    26    25    24    23    22    21    20 


Figure 3-4. Little-Endian Mapping of Structure S — Alternate View 

3.1.4 PowerPC Byte Ordering 

The PowerPC architecture supports both big- and little-endian byte ordering. The default 
byte ordering is big-endian. However, the code sequence used to switch from big- to little- 
endian mode may differ among processors. 

The PowerPC architecture defines two bits in the MSR for specifying byte ordering — LE 
(little-endian mode) and ILE (exception little-endian mode). The LE bit specifies the endian 
mode in which the processor is currently operating and ILE specifies the mode to be used 
when an exception handler is invoked. That is, when an exception occurs, the ILE bit (as 
set for the interrupted process) is copied into MSR[LE] to select the endian mode for the 
context established by the exception. For both bits, a value of 0 specifies big-endian mode 
and a value of 1 specifies little-endian mode. 

The PowerPC architecture also provides load and store instructions that reverse byte 
ordering. These instructions have the effect of loading and storing data in the endian mode 
opposite to that in which the processor is operating. See Section 4.2.3.4, “Integer Load and 
Store with Byte-Reverse Instructions,” for more information on these instructions. 
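
The effect of a byte-reversed load can be sketched in C as an ordinary big-endian load followed by a byte swap. The function names below are hypothetical, and the model assumes a processor operating in big-endian mode; it loosely corresponds to what a byte-reversed word load such as lwbrx produces, without modeling the instruction's addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ordinary word load in big-endian mode: MSB comes from the lowest address. */
uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reverses the order of the four bytes in a word. */
uint32_t byte_reverse32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Byte-reversed load: reads data stored in little-endian order. */
uint32_t load_be32_reversed(const uint8_t *p)
{
    return byte_reverse32(load_be32(p));
}
```

If memory holds the little-endian image 14 13 12 11, the ordinary load returns 0x1413_1211 and the byte-reversed load returns the intended scalar 0x1112_1314.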

3.1.4.1 Aligned Scalars in Little-Endian Mode 

Chapter 4, “Addressing Modes and Instruction Set Summary,” describes the effective 
address calculation for the load and store instructions. For processors in little-endian mode, 
the effective address is modified before being used to access memory. The three low-order 
address bits of the effective address are exclusive-ORed (XOR) with a three-bit value that 
depends on the length of the operand (1, 2, 4, or 8 bytes), as shown in Table 3-2. This 
address modification is called ‘munging’. 

NOTE: Although the process (munging) is described in the architecture, the actual term 
‘munging’ is not defined or used in the specification. However, the term is 
commonly used to describe the effective address modifications necessary for 
converting big-endian addressed data to little-endian addressed data. 


Table 3-2. EA Modifications 


Data Width (Bytes)   EA Modification 
8                    No change 
4                    XOR with 0b100 
2                    XOR with 0b110 
1                    XOR with 0b111 


The munged physical address is passed to the cache or to main memory, and the specified 
width of the data is transferred (in big-endian order — that is, MSB at the lowest address, 
LSB at the highest address) between a GPR or FPR and the addressed memory locations 
(as modified). 

Munging makes it appear to the processor that individual aligned scalars are stored as little- 
endian, when in fact they are stored in big-endian order, but at different byte addresses 
within double words. Only the address is modified, not the byte order. 
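
The address modification of Table 3-2 and the resulting memory image can be modeled in a few lines of C. This is a sketch under stated assumptions: memory is a plain byte array, only aligned scalars of 1, 2, 4, or 8 bytes are handled, and the names `munge_ea`, `le_mode_store`, and `le_mode_load` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Applies the EA modification of Table 3-2 for an aligned operand.
 * 8 - len gives 0b000, 0b100, 0b110, 0b111 for len = 8, 4, 2, 1. */
uint32_t munge_ea(uint32_t ea, unsigned len)
{
    assert(len == 1 || len == 2 || len == 4 || len == 8);
    return ea ^ (8u - len);
}

/* Store as performed in little-endian mode: the address is munged,
 * then the bytes are transferred in big-endian order. */
void le_mode_store(uint8_t *mem, uint32_t ea, uint64_t value, unsigned len)
{
    uint32_t addr = munge_ea(ea, len);
    for (unsigned i = 0; i < len; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * (len - 1 - i)));
}

/* Load counterpart of le_mode_store. */
uint64_t le_mode_load(const uint8_t *mem, uint32_t ea, unsigned len)
{
    uint32_t addr = munge_ea(ea, len);
    uint64_t v = 0;
    for (unsigned i = 0; i < len; i++)
        v = (v << 8) | mem[addr + i];
    return v;
}
```

Storing the word 0x1112_1314 at effective address 0 places the bytes 11 12 13 14 at memory addresses 4 through 7. A subsequent byte load from effective address 0 is munged to address 7 and returns 0x14, so the program observes the least-significant byte at the lowest address, as a little-endian layout requires.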

Taking into account the preceding description of munging, in little-endian mode, structure 
S is placed in memory as shown in Figure 3-5. 

Figure 3-5. Munged Little-Endian Structure S as Seen by the Memory Subsystem 

NOTE: The mapping shown in Figure 3-5 is not a true little-endian mapping of the 
structure S. However, because the processor munges the address when accessing 
memory, the physical structure S shown in Figure 3-5 appears to the processor as 
the structure S shown in Figure 3-6. 

Figure 3-6. Munged Little-Endian Structure S as Seen by Processor 

As seen by the program executing in the processor, the mapping for the structure S 
(Figure 3-6) is identical to the little-endian mapping shown in Figure 3-3. However, from 
outside of the processor, the addresses of the bytes making up the structure S are as shown 
in Figure 3-5. 

These addresses match neither the big-endian mapping of Figure 3-2 nor the true 
little-endian mapping of Figure 3-3. This must be taken into account when performing I/O 
operations in little-endian mode; this is discussed in Section 3.1.4.5, “PowerPC 
Input/Output Data Transfer Addressing in Little-Endian Mode.” 

3.1.4.2 Misaligned Scalars in Little-Endian Mode 

Performing an XOR operation on the low-order bits of the address works only if the scalar 
is aligned on a boundary equal to a multiple of its length. Figure 3-7 shows a true 
little-endian mapping of the four-byte word 0x1112_1314, stored at address 05. 

Figure 3-7. True Little-Endian Mapping, Word Stored at Address 05 

For the true little-endian example in Figure 3-7, the least-significant byte (0x14) is stored 
at address 0x05, the next byte (0x13) is stored at address 0x06, the third byte (0x12) is 
stored at address 0x07, and the most-significant byte (0x11) is stored at address 0x08. 

When a PowerPC processor, in little-endian mode, issues a single-register load or store 
instruction with a misaligned effective address, it may take an alignment exception. In this 
case, a single-register load or store instruction means any of the integer load/store, 
load/store with byte-reverse, memory synchronization (excluding sync), or floating-point 
load/store (including stfiwx) instructions. PowerPC processors in little-endian mode are not 
required to invoke an alignment exception when such a misaligned access is attempted. The 
processor may handle some or all such accesses without taking an alignment exception. 

The PowerPC architecture requires that half-words, words, and double words be placed in 
memory such that the little-endian address of the lowest-order byte is the effective address 
computed by the load or store instruction; the little-endian address of the next-lowest-order 
byte is one greater, and so on. However, because PowerPC processors in little-endian mode 
munge the effective address, the order of the bytes of a misaligned scalar must be as if they 
were accessed one at a time. 

Using the same example as shown in Figure 3-7, when the least-significant byte (0x14) is 
stored to address 0x05, the address is XORed with 0b111 to become 0x02. When the next 
byte (0x13) is stored to address 0x06, the address is XORed with 0b111 to become 0x01. 
When the third byte (0x12) is stored to address 0x07, the address is XORed with 0b111 to 
become 0x00. Finally, when the most-significant byte (0x11) is stored to address 0x08, the 
address is XORed with 0b111 to become 0x0F. Figure 3-8 shows the misaligned word, 
stored by a little-endian program, as seen by the memory subsystem. 

Figure 3-8. Word Stored at Little-Endian Address 05 as Seen by the Memory Subsystem 

NOTE: The misaligned word in this example spans two double words. The two parts of 
the misaligned word are not contiguous as seen by the memory system. An 
implementation may support some but not all misaligned little-endian accesses. 
For example, a misaligned little-endian access that is contained within a double 
word may be supported, while one that spans double words may cause an 
alignment exception. 
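
The byte-at-a-time treatment of a misaligned scalar can be sketched in C as follows. The function names are hypothetical and memory is modeled as a plain byte array; the sketch only illustrates where each byte lands, not whether a given implementation handles the access or takes an alignment exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Munged address of a single byte: XOR with 0b111. */
uint32_t munge_byte(uint32_t ea)
{
    return ea ^ 0x7u;
}

/* Stores a little-endian scalar of up to four bytes as if each byte were
 * accessed separately: the lowest-order byte goes to ea, the next to
 * ea + 1, and so on, with every byte address munged individually. */
void le_mode_store_bytes(uint8_t *mem, uint32_t ea, uint32_t value,
                         unsigned len)
{
    assert(len >= 1 && len <= 4);
    for (unsigned i = 0; i < len; i++)
        mem[munge_byte(ea + i)] = (uint8_t)(value >> (8 * i));
}

/* True if the access crosses a double-word (8-byte) boundary. */
bool spans_double_words(uint32_t ea, unsigned len)
{
    assert(len >= 1);
    return (ea >> 3) != ((ea + len - 1) >> 3);
}
```

Storing 0x1112_1314 at little-endian address 0x05 in this model writes 0x14, 0x13, 0x12, and 0x11 to memory addresses 0x02, 0x01, 0x00, and 0x0F, and `spans_double_words(0x05, 4)` reports that the access crosses a double-word boundary.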

3.1.4.3 Nonscalars 

The PowerPC architecture has two types of instructions that handle nonscalars (multiple 
instances of scalars): 

• Load and store multiple instructions 

• Load and store string instructions 

Because these instructions typically operate on more than one word-length scalar, munging 
cannot be used. These types of instructions cause alignment exception conditions when the 
processor is executing in little-endian mode. Although string accesses are not supported, 
they are inherently byte-based operations, and can be broken into a series of word-aligned 
accesses. 

3.1.4.4 PowerPC Instruction Addressing in Little-Endian Mode 

Each PowerPC instruction occupies an aligned word of memory. PowerPC processors fetch 
and execute instructions as if the current instruction address is incremented by four for each 
sequential instruction. When operating in little-endian mode, the instruction address is 
munged as described in Section 3.1.4.1, “Aligned Scalars in Little-Endian Mode,” for 
fetching word-length scalars; that is, the instruction address is XORed with 0b100. A 
program is thus an array of little-endian words with each word fetched and executed in 
order (not including branches). 

All instruction addresses visible to an executing program are the effective addresses that are 
computed by that program, or, in the case of the exception handlers, effective addresses that 
were or could have been computed by the interrupted program. These effective addresses 
are independent of the endian mode. Examples for little-endian mode include the 
following: 

• An instruction address placed in the link register by branch and link operation, or an 
instruction address saved in an SPR when an exception is taken, is the address that 
a program executing in little-endian mode would use to access the instruction as a 
word of data using a load instruction. 

• An offset in a relative branch instruction reflects the difference between the 
addresses of the branch and target instructions, where the addresses used are those 
that a program executing in little-endian mode would use to access the instructions 
as data words using a load instruction. 

• A target address in an absolute branch instruction is the address that a program 
executing in little-endian mode would use to access the target instruction as a word 
of data using a load instruction. 

• The memory locations that contain the first set of instructions executed by each kind 
of exception handler must be set in a manner consistent with the endian mode in 
which the exception handler is invoked. Thus, if the exception handler is to be 
invoked in little-endian mode, the first set of instructions comprising each kind of 
exception handler must appear in memory with the instructions within each double 
word reversed from the order in which they are to be executed. 
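
The effect of the 0b100 address modification on the layout of a program in memory can be sketched as follows. The sketch assumes a program that starts at a double-word-aligned address (taken here as 0) and occupies a whole number of double words; the function names and the placeholder instruction labels are hypothetical.

```python
def munge_word_address(ea):
    """Return the address the memory subsystem sees for an aligned word access."""
    return ea ^ 0b100


def memory_layout(instructions):
    """Place instructions (listed in execution order) as memory would hold them.

    Element i of the result is the word at memory address 4 * i.
    """
    if len(instructions) % 2:
        raise ValueError("pad the program to a whole number of double words")
    layout = [None] * len(instructions)
    for index, instruction in enumerate(instructions):
        # The program's effective address 4 * index is munged before the fetch.
        layout[munge_word_address(4 * index) // 4] = instruction
    return layout


print(memory_layout(["i0", "i1", "i2", "i3"]))
```

The two instructions within each double word appear swapped in memory, which matches the requirement stated for the first instructions of an exception handler invoked in little-endian mode.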

3.1.4.5 PowerPC Input/Output Data Transfer Addressing in Little-Endian Mode 

For a PowerPC system running in big-endian mode, both the processor and the memory 
subsystem recognize the same byte as byte 0. However, this is not true for a PowerPC 
system running in little-endian mode because of the munged address bits when the 
processor accesses memory. 

For I/O transfers in little-endian mode to transfer bytes properly, they must be performed 
as if the bytes transferred were accessed one at a time, using the little-endian address 
modification appropriate for the single-byte transfers (that is, the lowest order address bits 
must be XORed with 0b111). This does not mean that I/O operations in little-endian 
PowerPC systems must be performed using only one-byte-wide transfers. Data transfers 
can be as wide as desired, but the order of the bytes within double words must be as if they 
were fetched or stored one at a time. That is, for a true little-endian I/O device, the system 
must provide a mechanism to munge and unmunge the addresses and reverse the bytes 
within a double word (MSB to LSB). 

In earlier processors, I/O operations can also be performed with certain devices by storing 
to or loading from addresses that are associated with the devices (this is referred to as 
direct-store interface operations). However, the direct-store facility is being phased out of 
the architecture and will not likely be supported in future devices. Care must be taken with 
such operations when defining the addresses to be used because these addresses are 
subjected to munging as described in Section 3.1.4.1, “Aligned Scalars in Little-Endian 
Mode.” A load or store that maps to a control register on an external device may require the 
bytes of the value transferred to be reversed. If this reversal is required, the load and store 
with byte-reverse instructions may be used. See Section 4.2.3.4, “Integer Load and Store 
with Byte-Reverse Instructions,” for more information on these instructions. 
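
The byte-ordering rule for little-endian I/O transfers described in this section (each byte placed as if it were transferred alone, with its address XORed with 0b111) can be sketched as follows. This is a model of the required result, not of any particular system's hardware; the function name is hypothetical, and the buffer is assumed to start on a double-word boundary and to be a whole number of double words long.

```python
def place_little_endian_stream(stream):
    """Return the memory image for a byte stream from a true little-endian device.

    stream[a] is the byte the device associates with little-endian address a.
    """
    if len(stream) % 8:
        raise ValueError("buffer must be a whole number of double words")
    memory = bytearray(len(stream))
    for address, byte in enumerate(stream):
        memory[address ^ 0b111] = byte  # same modification as a one-byte access
    return bytes(memory)


image = place_little_endian_stream(bytes(range(16)))
print(image.hex(" "))
```

The result is the same as reversing the bytes within each double word, so a wide transfer is acceptable as long as it produces this ordering. Applying the function twice restores the original stream.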


3.2 Effect of Operand Placement on 
Performance — VEA 



The PowerPC VEA states that the placement (location and alignment) of operands in 
memory affects the relative performance of memory accesses. The best performance is 
guaranteed if memory operands are aligned on natural boundaries. For more information 
on memory access ordering and atomicity, refer to Section 5.1, “The Virtual Environment.” 


3.2.1 Summary of Performance Effects 

To obtain the best performance across the widest range of PowerPC processor 
implementations, the programmer should assume the performance model described in 
Table 3-3 and Table 3-4 with respect to the placement of memory operands. 

The performance of accesses varies depending on: 

• Operand size 

• Operand alignment 

• Endian mode (big-endian or little-endian) 

• Crossing no boundary 

• Crossing a cache block boundary 

• Crossing a page boundary 

• Crossing a BAT boundary 

• Crossing a segment boundary 

Table 3-3 applies when the processor is in big-endian mode. 

Table 3-3. Performance Effects of Memory Operand Placement, Big-Endian Mode 


Operand                     Boundary Crossing
Size        Byte Alignment  None      Cache Block  Page      BAT/Segment
----------  --------------  --------  -----------  --------  -----------
Integer
8 byte      8               Optimal   -            -         -
            4               Good      Good         Poor      Poor
            <4              Poor      Poor         Poor      Poor
4 byte      4               Optimal   -            -         -
            <4              Good      Good         Poor      Poor
2 byte      2               Optimal   -            -         -
            <2              Good      Good         Poor      Poor
1 byte      1               Optimal   -            -         -
lmw, stmw   4               Good      Good         Good (1)  Poor
String      -               Good      Good         Poor      Poor
Floating Point
8 byte      8               Optimal   -            -         -
            4               Good      Good         Poor      Poor
            <4              Poor      Poor         Poor      Poor
4 byte      4               Optimal   -            -         -
            <4              Poor      Poor         Poor      Poor


Note: (1) Crossing a page boundary where the memory/cache access attributes of the two pages 
differ is equivalent to crossing a segment boundary, and thus has poor performance. 

Table 3-4 applies when the processor is in little-endian mode. 

Table 3-4. Performance Effects of Memory Operand Placement, Little-Endian Mode 


Operand                     Boundary Crossing
Size        Byte Alignment  None      Cache Block  Page      BAT/Segment
----------  --------------  --------  -----------  --------  -----------
Integer
8 byte      8               Optimal   -            -         -
            <8              Poor      Poor         Poor      Poor
4 byte      4               Optimal   -            -         -
            <4              Poor      Poor         Poor      Poor
2 byte      2               Optimal   -            -         -
            <2              Poor      Poor         Poor      Poor
1 byte      1               Optimal   -            -         -
Floating Point
8 byte      8               Optimal   -            -         -
            <8              Poor      Poor         Poor      Poor
4 byte      4               Optimal   -            -         -
            <4              Poor      Poor         Poor      Poor


The load/store multiple and the load/store string instructions are supported only in big- 
endian mode. The load/store multiple instructions are defined by the PowerPC architecture 
to operate only on aligned operands. The load/store string instructions have no alignment 
requirements. 
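
When applying Table 3-3 and Table 3-4, it can help to determine which boundaries a given access crosses. The following sketch does only that; it does not assign the performance ratings. The 32-byte cache block size is an assumption (it is implementation-dependent), the 4-Kbyte page size is likewise treated as a parameter here, and the function names are hypothetical.

```python
CACHE_BLOCK_SIZE = 32  # assumption: implementation-dependent
PAGE_SIZE = 4096       # assumption: 4-Kbyte pages


def crosses(ea, size, boundary):
    """True if the bytes ea .. ea + size - 1 span a multiple of boundary."""
    return ea // boundary != (ea + size - 1) // boundary


def describe_access(ea, size):
    """Summarize the alignment and boundary crossings of a scalar access."""
    return {
        "aligned": ea % size == 0,  # natural alignment for the operand size
        "crosses_cache_block": crosses(ea, size, CACHE_BLOCK_SIZE),
        "crosses_page": crosses(ea, size, PAGE_SIZE),
    }


print(describe_access(0x1000, 4))
print(describe_access(0x0FFE, 4))
```

An aligned scalar never crosses either boundary, which corresponds to the rows of the tables where only the "None" column applies.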

3.2.2 Instruction Restart 

If a memory access crosses a page, BAT, or segment boundary, a number of conditions 
could abort the execution of the instruction after part of the access has been performed. For 
example, this may occur when a program attempts to access a page it has not previously 
accessed or when the processor must check for a possible change in the memory/cache 
access attributes when an access crosses a page boundary. When this occurs, the processor 
or the operating system may restart the instruction. If the instruction is restarted, some bytes 
at that location may be loaded from or stored to the target location a second time. 

The following rules apply to memory accesses with regard to restarting the instruction: 

• Aligned accesses — A single-register instruction that accesses an aligned operand is 
never restarted (that is, it is not partially executed). 

• Misaligned accesses — A single-register instruction that accesses a misaligned 
operand may be restarted if the access crosses a page, BAT, or segment boundary, or 
if the processor is in little-endian mode. 

• Load/store multiple, load/store string instructions — These instructions may be 
restarted if, in accessing the locations specified by the instruction, a page, BAT, or 
segment boundary is crossed. 

The programmer should assume that any misaligned access in a segment might be restarted. 
When the processor is in big-endian mode, software can ensure that misaligned accesses 
are not restarted by placing the misaligned data in BAT areas, as BAT areas have no internal 
protection boundaries. Refer to Section 7.4, “Block Address Translation,” for more 
information on BAT areas. 


3.3 Floating-Point Execution Models — UISA 


There are two kinds of floating-point instructions defined for the PowerPC architecture: 
computational and noncomputational. The computational instructions consist of those 
operations defined by the IEEE-754 standard for 64- and 32-bit arithmetic (those that 
perform addition, subtraction, multiplication, division, extracting the square root, rounding, 
conversion, comparison, and combinations of these) and the multiply-add and reciprocal 
estimate instructions defined by the architecture. The noncomputational floating-point 
instructions consist of the floating-point load, store, and move instructions. While both the 
computational and noncomputational instructions are considered to be floating-point 
instructions governed by the MSR[FP] bit (that allows floating-point instructions to be 
executed), only the computational instructions are considered floating-point operations 
throughout this chapter. 



The IEEE standard requires that single-precision arithmetic be provided for single- 
precision operands. The standard permits double-precision arithmetic instructions to have 
either (or both) single-precision or double-precision operands, but states that single- 
precision arithmetic instructions should not accept double-precision operands. The 
guidelines are as follows: 

• Double-precision arithmetic instructions may have single-precision operands but 
always produce double-precision results. 

• Single-precision arithmetic instructions require all operands to be single-precision 
and always produce single-precision results. 


For arithmetic instructions, conversion from double- to single-precision must be done 
explicitly by software, while conversion from single- to double-precision is done implicitly 
by the processor. 


All PowerPC implementations provide the equivalent of the following execution models to 
ensure that identical results are obtained. The definition of the arithmetic instructions for 
infinities, denormalized numbers, and NaNs follows conventions described in the following 
sections. 


Appendix D, “Floating-Point Models,” has additional detailed information on the execution 
models for IEEE operations as well as the other floating-point instructions. 

Although the double-precision format specifies an 11-bit exponent, exponent arithmetic 
uses two additional bit positions to avoid potential transient overflow conditions. An extra 
bit is required when denormalized double-precision numbers are prenormalized. A second 
bit is required to permit computation of the adjusted exponent value in the following 
examples when the corresponding exception enable bit is 1 (exceptions are referred to as 
interrupts in the architecture specification): 

• Underflow during multiplication using a denormalized operand 

• Overflow during division using a denormalized divisor 

3.3.1 Floating-Point Data Format 

The PowerPC UISA defines the representation of a floating-point value in two different 
binary, fixed-length formats. The format is a 32-bit format for a single-precision floating- 
point value or a 64-bit format for a double-precision floating-point value. The single- 
precision format may be used for data in memory. The double-precision format can be used 
for data in memory or in floating-point registers (FPRs). 

The lengths of the exponent and the fraction fields differ between these two formats. The 
layout of the single-precision format is shown in Figure 3-9; the layout of the double- 
precision format is shown in Figure 3-10. 


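+---+----------+------------------------------+
| S |   EXP    |           FRACTION           |
+---+----------+------------------------------+
  0   1      8   9                          31 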


Figure 3-9. Floating-Point Single-Precision Format 


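+---+----------+------------------------------+
| S |   EXP    |           FRACTION           |
+---+----------+------------------------------+
  0   1     11   12                         63 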


Figure 3-10. Floating-Point Double-Precision Format 

Values in floating-point format consist of three fields: 

• S (sign bit) 

• EXP (exponent + bias) 

• FRACTION (fraction) 

If only a portion of a floating-point data item in memory is accessed, as with a load or store 
instruction for a byte or half word (or word in the case of floating-point double-precision 
format), the value affected depends on whether the PowerPC system is using big- or little- 
endian byte ordering, which is described in Section 3.1.2, “Byte Ordering.” 

Big-endian mode is the default. 

For numeric values, the significand consists of a leading implied bit concatenated on the 
right with the FRACTION. This leading implied bit is a 1 for normalized numbers and a 0 
for denormalized numbers and is the first bit to the left of the binary point. Values 
representable within the two floating-point formats can be specified by the parameters 
listed in Table 3-5. 

Table 3-5. IEEE Floating-Point Fields 


Parameter                     Single-Precision   Double-Precision
----------------------------  ----------------   ----------------
Exponent bias                 +127               +1023
Maximum exponent (unbiased)   +127               +1023
Minimum exponent (unbiased)   -126               -1022
Format width                  32 bits            64 bits
Sign width                    1 bit              1 bit
Exponent width                8 bits             11 bits
Fraction width                23 bits            52 bits
Significand width             24 bits            53 bits


The true value of the exponent can be determined by subtracting 127 for single-precision 
numbers and 1023 for double-precision numbers. This is shown in Table 3-6. 

NOTE: Two exponent values are reserved to represent special-case values: 

— Setting all bits indicates that the value is infinity, or NaN. 

— Clearing all bits indicates that the number is either zero, or denormalized. 
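
The field widths and biases given in Table 3-5 can be applied to extract the three fields from a value, as in the following sketch. It runs on a host system using the standard struct module rather than on a PowerPC processor, and the function and table names are hypothetical.

```python
import struct

# precision: (float pack format, integer pack format, exponent width, fraction width, bias)
FORMATS = {
    "single": (">f", ">I", 8, 23, 127),
    "double": (">d", ">Q", 11, 52, 1023),
}


def decode_fields(value, precision="double"):
    """Return (S, EXP, FRACTION) for value encoded in the given format."""
    float_fmt, int_fmt, exp_bits, frac_bits, _ = FORMATS[precision]
    bits = struct.unpack(int_fmt, struct.pack(float_fmt, value))[0]
    sign = bits >> (exp_bits + frac_bits)
    biased_exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, biased_exponent, fraction


def unbiased_exponent(biased_exponent, precision="double"):
    """Subtract the bias; return None for the two reserved exponent values."""
    _, _, exp_bits, _, bias = FORMATS[precision]
    if biased_exponent in (0, (1 << exp_bits) - 1):
        return None  # reserved: zero/denormalized or infinity/NaN
    return biased_exponent - bias


print(decode_fields(0.15625, "single"))
print(unbiased_exponent(124, "single"))
```

Here 0.15625 is 1.25 x 2^-3, so the biased single-precision exponent is 124 and the fraction field holds the binary digits of 0.25.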


Table 3-6. Biased Exponent Format 


Biased Exponent   Single-Precision   Double-Precision
(Binary)          (Unbiased)         (Unbiased)
---------------   ----------------   ----------------
11...11           Reserved for infinities and NaNs
11...10           +127               +1023
11...01           +126               +1022
   .                .                  .
10...00           1                  1
01...11           0                  0
01...10           -1                 -1
   .                .                  .
00...01           -126               -1022
00...00           Reserved for zeros and denormalized numbers


3.3.1.1 Value Representation 

The PowerPC UISA defines numerical and nonnumerical values representable within 
single- and double-precision formats. The numerical values are approximations to the real 
numbers and include the normalized numbers, denormalized numbers, and zero values. The 
nonnumerical values representable are the positive and negative infinities and the NaNs. 
The positive and negative infinities are adjoined to the real numbers but are not numbers 
themselves, and the standard rules of arithmetic do not hold when they appear in an 
operation. They are related to the real numbers by order alone. It is possible, however, to 
define restricted operations among numbers and infinities as defined below. The relative 
location on the real number line for each of the defined numerical entities is shown in 
Figure 3-11. Tiny values include denormalized numbers and all numbers that are too small 
to be represented for a particular precision format; they do not include zero values. 

Figure 3-11. Approximation to Real Numbers 

The positive and negative NaNs are encodings that convey diagnostic information such as 
the representation of uninitialized variables and are not related to the numbers, ±∞, or each 
other by order or value. 

Table 3-7 describes each of the floating-point formats. 

Table 3-7. Recognized Floating-Point Numbers 


Sign Bit   Biased Exponent           Implied Bit   Fraction   Value
--------   -----------------------   -----------   --------   -------------
0          Maximum                   x             Nonzero    NaN
0          Maximum                   x             Zero       +Infinity
0          0 < Exponent < Maximum    1             x          +Normalized
0          0                         0             Nonzero    +Denormalized
0          0                         x             Zero       +0
1          0                         x             Zero       -0
1          0                         0             Nonzero    -Denormalized
1          0 < Exponent < Maximum    1             x          -Normalized
1          Maximum                   x             Zero       -Infinity
1          Maximum                   x             Nonzero    NaN
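
The rows of Table 3-7 can be expressed as a small classification routine. This is a sketch for illustration; the function name and its string results are hypothetical, and the exponent width is passed as a parameter (8 for single-precision, 11 for double-precision).

```python
def classify(sign, biased_exponent, fraction, exponent_bits=8):
    """Return the Table 3-7 value class for the given field contents."""
    maximum = (1 << exponent_bits) - 1  # exponent field with all bits set
    prefix = "-" if sign else "+"
    if biased_exponent == maximum:
        # The sign bit carries no algebraic meaning for a NaN.
        return "NaN" if fraction else prefix + "Infinity"
    if biased_exponent == 0:
        return prefix + ("Denormalized" if fraction else "0")
    return prefix + "Normalized"


print(classify(0, 255, 0))      # single-precision fields
print(classify(1, 0, 5))
print(classify(0, 2047, 1, exponent_bits=11))
```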


The following sections describe floating-point values defined in the architecture. 

3.3.1.2 Binary Floating-Point Numbers 

Binary floating-point numbers are machine-representable values used to approximate real 
numbers. Three categories of numbers are supported — normalized numbers, denormalized 
numbers, and zero values. 

3.3.1.3 Normalized Numbers (±NORM) 

The values for normalized numbers have a biased exponent value in the range: 

• 1-254 in single-precision format 

• 1-2046 in double-precision format 

The implied unit bit is one. Normalized numbers are interpreted as follows: 

NORM = (-1)^s × 2^E × (1.fraction) 

The variable (s) is the sign, (E) is the unbiased exponent, and (1.fraction) is the significand 
composed of a leading unit bit (implied bit) and a fractional part. The format for normalized 
numbers is shown in Figure 3-12. 


| SIGN BIT, 0 OR 1 | MIN < EXPONENT < MAX (BIASED) | FRACTION = ANY BIT PATTERN | 


Figure 3-12. Format for Normalized Numbers 

The ranges covered by the magnitude (M) of a normalized floating-point number are 
approximated in the following decimal representation: 

Single-precision format: 

1.2×10^-38 < M < 3.4×10^38 

Double-precision format: 

2.2×10^-308 < M < 1.8×10^308 

3.3.1.4 Zero Values (±0) 

Zero values have a biased exponent value of zero and fraction of zero. This is shown in 
Figure 3-13. Zeros can have a positive or negative sign. The sign of zero is ignored by 
comparison operations (that is, comparison regards +0 as equal to -0). Arithmetic with zero 
results is always exact and does not signal any exception, except when an exception occurs 
due to the invalid operations as described in Section 3.3.6.1.1, “Invalid Operation 
Exception Condition.” Rounding a zero only affects the sign. 



| SIGN BIT, 0 OR 1 | EXPONENT = 0 (BIASED) | FRACTION = 0 | 


Figure 3-13. Format for Zero Numbers 

3.3.1.5 Denormalized Numbers (±DENORM) 

Denormalized numbers have a biased exponent value of zero and a nonzero fraction. The 
format for denormalized numbers is shown in Figure 3-14. 



| SIGN BIT, 0 OR 1 | EXPONENT = 0 (BIASED) | FRACTION = ANY NONZERO BIT PATTERN | 


Figure 3-14. Format for Denormalized Numbers 

Denormalized numbers are nonzero numbers smaller in magnitude than the normalized 
numbers. They are values in which the implied unit bit is zero. Denormalized numbers are 
interpreted as follows: 

DENORM = (-1)^s × 2^Emin × (0.fraction) 

The value Emin is the minimum unbiased exponent value for a normalized number (-126 
for single-precision, -1022 for double-precision). 
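
The interpretations given for normalized and denormalized numbers can be evaluated exactly with rational arithmetic, as in the following sketch. The function name and parameter table are hypothetical; the maximum biased exponent is rejected because it does not encode a number.

```python
from fractions import Fraction

# precision: (fraction width, exponent bias, Emin, maximum biased exponent)
PARAMETERS = {
    "single": (23, 127, -126, 255),
    "double": (52, 1023, -1022, 2047),
}


def field_value(sign, biased_exponent, fraction, precision="single"):
    """Return the exact numeric value encoded by the three fields."""
    frac_bits, bias, emin, maximum = PARAMETERS[precision]
    if biased_exponent == maximum:
        raise ValueError("maximum exponent encodes an infinity or NaN")
    frac = Fraction(fraction, 1 << frac_bits)  # the digits after the binary point
    if biased_exponent == 0:
        # Zero or denormalized: implied bit is 0, exponent is fixed at Emin.
        magnitude = Fraction(2) ** emin * frac
    else:
        # Normalized: implied bit is 1, exponent is the unbiased value E.
        magnitude = Fraction(2) ** (biased_exponent - bias) * (1 + frac)
    return -magnitude if sign else magnitude


print(field_value(1, 128, 1 << 22))     # -(2^1 x 1.5)
print(float(field_value(0, 1, 0)))      # smallest normalized single-precision magnitude
print(float(field_value(0, 0, 1)))      # smallest denormalized single-precision magnitude
```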

3.3.1.6 Infinities (±∞) 

These are values that have the maximum biased exponent value of 255 in the single- 
precision format, 2047 in the double-precision format, and a zero fraction value. They are 
used to approximate values greater in magnitude than the maximum normalized value. 
Infinity arithmetic is defined as the limiting case of real arithmetic, with restricted 
operations defined among numbers and infinities. Infinities and the real numbers can be 
related by numeric ordering in the following sense: 

-∞ < every finite number < +∞ 

The format for infinities is shown in Figure 3-15. 



| SIGN BIT, 0 OR 1 | EXPONENT = MAXIMUM (BIASED) | FRACTION = 0 | 


Figure 3-15. Format for Positive and Negative Infinities 

Arithmetic using infinite numbers is always exact and does not signal any exception, except 
when an exception occurs due to the invalid operations as described in Section 3.3.6.1.1, 
“Invalid Operation Exception Condition.” 

3.3.1.7 Not a Numbers (NaNs) 

NaNs have the maximum biased exponent value and a nonzero fraction. The format for 
NaNs is shown in Figure 3-16. The sign bit of NaN does not show an algebraic sign; rather, 
it is simply another bit in the NaN. If the highest-order bit of the fraction field is a zero, the 
NaN is a signaling NaN; otherwise it is a quiet NaN (QNaN). 



| SIGN BIT (ignored) | EXPONENT = MAXIMUM (BIASED) | FRACTION = ANY NONZERO BIT PATTERN | 


Figure 3-16. Format for NaNs 

Signaling NaNs signal exceptions when they are specified as arithmetic operands. 

Quiet NaNs represent the results of certain invalid operations, such as attempts to perform 
arithmetic operations on infinities or NaNs, when the invalid operation exception is 
disabled (FPSCR[VE] = 0). Quiet NaNs propagate through all operations, except floating- 
point round to single-precision, ordered comparison, and conversion to integer operations, 
and signal exceptions only for ordered comparison and conversion to integer operations. 
Specific encodings in QNaNs can thus be preserved through a sequence of operations and 
used to convey diagnostic information to help identify results from invalid operations. 

When a QNaN results from an operation because an operand is a NaN or because a QNaN 
is generated due to a disabled invalid operation exception, the following rule is applied to 
determine the QNaN to be stored as the result: 

    if (frA) is a NaN 
        then frD ← (frA) 
    else if (frB) is a NaN 
        then if instruction is frsp 
            then frD ← (frB)[0-34] || (29)0 
            else frD ← (frB) 
    else if (frC) is a NaN 
        then frD ← (frC) 
    else if generated QNaN 
        then frD ← generated QNaN 

If the operand specified by frA is a NaN, that NaN is stored as the result. Otherwise, if the 
operand specified by frB is a NaN (if the instruction specifies an frB operand), that NaN is 
stored as the result, with the low-order 29 bits cleared if the instruction is frsp. Otherwise, if the operand specified 
by frC is a NaN (if the instruction specifies an frC operand), that NaN is stored as the result. 
Otherwise, if a QNaN is generated by a disabled invalid operation exception, that QNaN is 
stored as the result. If a QNaN is to be generated as a result, the QNaN generated has a sign 
bit of zero, an exponent field of all ones, and a highest-order fraction bit of one with all 
other fraction bits zero. An instruction that generates a QNaN as the result of a disabled 
invalid operation generates this QNaN. This is shown in Figure 3-17. 


| 0 (SIGN BIT, ignored) | 111...1 | 1000...0 | 


Figure 3-17. Representation of Generated QNaN 
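
The QNaN selection rule given above can be sketched on 64-bit register images as follows. The sketch follows the rule exactly as written in this section and models nothing else about NaN handling; the function names, the use of None for an operand that the instruction does not specify, and the keyword parameters are hypothetical.

```python
EXPONENT_MASK = 0x7FF0_0000_0000_0000
FRACTION_MASK = 0x000F_FFFF_FFFF_FFFF
GENERATED_QNAN = 0x7FF8_0000_0000_0000  # sign 0, exponent all ones, fraction 1000...0
LOW_29_BITS = (1 << 29) - 1


def is_nan(bits):
    """True if a 64-bit image has the maximum exponent and a nonzero fraction."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & FRACTION_MASK) != 0


def select_qnan_result(fra=None, frb=None, frc=None, is_frsp=False, generated=False):
    """Return the 64-bit image stored in frD, or None if no rule applies."""
    if fra is not None and is_nan(fra):
        return fra
    if frb is not None and is_nan(frb):
        # frsp keeps bits 0-34 and clears the low-order 29 bits.
        return frb & ~LOW_29_BITS if is_frsp else frb
    if frc is not None and is_nan(frc):
        return frc
    if generated:
        return GENERATED_QNAN
    return None


print(hex(select_qnan_result(frb=0x7FF8_0000_3FFF_FFFF, is_frsp=True)))
```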


3.3.2 Sign of Result 

The following rules govern the sign of the result of an arithmetic operation, when the 
operation does not yield an exception. These rules apply even when the operands or results 
are zero (0) or infinity (±∞): 

• The sign of the result of an addition operation is the sign of the source operand 
having the larger absolute value. If both operands have the same sign, the sign of the 
result of an addition operation is the same as the sign of the operands. The sign of 
the result of the subtraction operation, x - y, is the same as the sign of the result of 
the addition operation, x + (-y). 

• When the sum of two operands with opposite sign, or the difference of two operands 
with the same sign, is exactly zero, the sign of the result is positive in all rounding 
modes except round toward negative infinity (-∞), in which case the sign is negative. 

• The sign of the result of a multiplication or division operation is the XOR of the 
signs of the source operands. 

• The sign of the result of a round to single-precision or convert to/from integer 
operation is the sign of the source operand. 

• The sign of the result of a square root or reciprocal square root estimate operation is 
always positive, except that the square root of -0 is -0 and the reciprocal square root 
of -0 is -infinity. 

For multiply-add/subtract instructions, these rules are applied first to the multiplication 
operation and then to the addition/subtraction operation (one of the source operands to the 
addition/subtraction operation is the result of the multiplication operation). 
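
Several of the sign rules above can be observed with ordinary host floating-point arithmetic, as in the following sketch. It assumes the host follows IEEE 754 with round-to-nearest as its rounding mode, so the round toward negative infinity case is not exercised; the helper names are hypothetical.

```python
import math


def sign_bit(x):
    """Return 1 if x carries a negative sign (including -0), otherwise 0."""
    return 1 if math.copysign(1.0, x) < 0 else 0


def product_sign(a, b):
    """Sign of a product or quotient: the XOR of the operand signs."""
    return sign_bit(a) ^ sign_bit(b)


checks = {
    "exact zero difference is +0": sign_bit(5.0 - 5.0) == 0,
    "(-0) + (-0) keeps the common sign": sign_bit(-0.0 + -0.0) == 1,
    "product sign is the XOR of operand signs": sign_bit(-3.0 * 0.0) == product_sign(-3.0, 0.0) == 1,
    "square root of -0 is -0": sign_bit(math.sqrt(-0.0)) == 1,
}
for rule, holds in checks.items():
    print(f"{rule}: {holds}")
```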

3.3.3 Normalization and Denormalization 

The intermediate result of an arithmetic or Floating Round to Single-Precision (frspx) 
instruction may require normalization and/or denormalization. When an intermediate result 
consists of a sign bit, an exponent, and a nonzero significand with a zero leading bit, the 
result must be normalized (and rounded) before being stored to the target. 

A number is normalized by shifting its significand left and decrementing its exponent by 
one for each bit shifted until the leading significand bit becomes one. The guard and round 
bits are also shifted, with zeros shifted into the round bit; see Section D.1, “Execution Model 
for IEEE Operations,” for information about the guard and round bits. During 
normalization, the exponent is regarded as if its range were unlimited. 

If an intermediate result has a nonzero significand and an exponent that is smaller than the 
minimum value that can be represented in the format specified for the result, this value is 
referred to as “tiny” and the stored result is determined by the rules described in Section 
3.3.6.2.2, “Underflow Exception Condition.” These rules may involve denormalization. 
The sign of the number does not change. 

An exponent can become tiny in either of the following circumstances: 

• As the result of an arithmetic or Floating Round to Single-Precision (frspx) 
instruction, or 

• As the result of decrementing the exponent in the process of normalization. 

Normalization is the process of coercing the leading significand bit to be a 1 while 
denormalization is the process of coercing the exponent into the target format's range. 

In denormalization, the significand is shifted to the right while the exponent is incremented 
for each bit shifted until the exponent equals the format’s minimum value. The result is then 
rounded. If any significand bits are lost due to the rounding of the shifted value, the result 
is considered inexact. 
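
The shifting steps of normalization and denormalization can be sketched on an integer significand as follows. This is a simplified model: the guard and round bits and the final rounding step are omitted, and the sketch only reports whether nonzero bits were shifted out. The function names and the fixed significand width parameter are hypothetical.

```python
def normalize(significand, exponent, width):
    """Shift left until the leading bit (bit width - 1) is 1.

    The exponent is decremented once per shift and treated as unbounded.
    """
    if significand == 0:
        raise ValueError("a zero significand cannot be normalized")
    while not (significand >> (width - 1)) & 1:
        significand <<= 1
        exponent -= 1
    return significand, exponent


def denormalize(significand, exponent, minimum_exponent):
    """Shift right until the exponent reaches the format's minimum value.

    Returns (significand, exponent, bits_lost); bits_lost is True if any
    nonzero bit was shifted out, which rounding would have to account for.
    """
    bits_lost = False
    while exponent < minimum_exponent:
        bits_lost = bits_lost or bool(significand & 1)
        significand >>= 1
        exponent += 1
    return significand, exponent, bits_lost


print(normalize(0b0011, 0, width=4))
print(denormalize(0b1101, -127, -126))
```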

The sign of the number does not change. 

3.3.4 Data Handling and Precision 

There are specific instructions for moving floating-point data between the FPRs and 
memory. For double-precision format data, the data is not altered during the move. For 
single-precision data, the format is converted to double-precision format when data is 
loaded from memory into an FPR. A format conversion from double- to single-precision is 
performed when data from an FPR is stored as single-precision. These operations do not 
cause floating-point exceptions. 

All floating-point arithmetic, move, and select instructions use floating-point double- 
precision format. 

Floating-point single-precision formats are obtained by using the following four types of 
instructions: 

• Load floating-point single-precision instructions — These instructions access a 
single-precision operand in single-precision format in memory, convert it to double- 
precision, and load it into an FPR. Floating-point exceptions do not occur during the 
load operation. 

• The floating round to single-precision (frspx) instruction — The frspx instruction 
rounds a double-precision operand to single-precision, checking the exponent for 
single-precision range and handling any exceptions according to respective enable 
bits in the FPSCR. The instruction places that operand into an FPR as a double- 
precision operand. For results produced by single-precision arithmetic instructions 
and by single-precision loads, this operation does not alter the value. 

• Single-precision arithmetic instructions — These instructions take operands from 
the FPRs in double-precision format, perform the operation as if it produced an 
intermediate result correct to infinite precision and with unbounded range, and then 
force this intermediate result to fit in single-precision format. Status bits in the 
FPSCR and in the condition register are set to reflect the single-precision result. The 
result is then converted to double-precision format and placed into an FPR. The 
result falls within the range supported by the single-precision format. 

Source operands for these instructions must be representable in single-precision 
format. Otherwise, the result placed into the target FPR and the setting of status bits 
in the FPSCR, and in the condition register if update mode is selected, are undefined. 

• Store floating-point single-precision instructions — These instructions convert a 
double-precision operand to single-precision format and store that operand into 
memory. If the operand requires denormalization in order to fit in single-precision 
format, it is automatically denormalized prior to being stored. No exceptions are 
detected on the store operation (the value being stored is effectively assumed to be 
the result of an instruction of one of the preceding three types). 

When the result of a Load Floating-Point Single (lfs), Floating Round to Single-Precision 
(frspx), or single-precision arithmetic instruction is stored in an FPR, the low-order 29 
fraction bits are zero. This is shown in Figure 3-18. 

Figure 3-18. Single-Precision Representation in an FPR 

The frspx instruction allows conversion from double- to single-precision with appropriate 
exception checking and rounding. This instruction should be used to convert double- 
precision floating-point values (produced by double-precision load and arithmetic 
instructions) to single-precision values before storing them into single-format memory 
elements or using them as operands for single-precision arithmetic instructions. Values 
produced by single-precision load and arithmetic instructions can be stored directly, or used 
directly as operands for single-precision arithmetic instructions, without being preceded by 
an frspx instruction. 




A single-precision value can be used in double-precision arithmetic operations. The reverse 
is true only if the double-precision value can be represented in single-precision format. 
Some implementations may execute single-precision arithmetic instructions faster than 
double-precision arithmetic instructions. Therefore, if double-precision accuracy is not 
required, using single-precision data and instructions may speed operations in some 
implementations. 


3.3.5 Rounding 

All arithmetic, rounding, and conversion instructions defined by the PowerPC architecture 
(except the optional Floating Reciprocal Estimate Single (fresx) and Floating Reciprocal
Square Root Estimate (frsqrtex) instructions) produce an intermediate result considered to
be infinitely precise and with unbounded exponent range. This intermediate result is 
normalized or denormalized if required, and then rounded to the destination format. The 
final result is then placed into the target FPR in the double-precision format or in fixed-point 
format, depending on the instruction. 

The IEEE-754 specification allows loss of accuracy to be defined as when the rounded 
result differs from the infinitely precise value with unbounded range (same as the definition 
of “inexact”). In the PowerPC architecture, this is the way loss of accuracy is detected.

Let Z be the intermediate arithmetic result (with infinite precision and unbounded range) or 
the operand of a conversion operation. If Z can be represented exactly in the target format, 
then the result in all rounding modes is exactly Z. If Z cannot be represented exactly in the 
target format, let Z1 and Z2 be the next larger and next smaller numbers representable in 
the target format that bound Z; then Z1 or Z2 can be used to approximate the result in the 
target format.

Figure 3-19 shows a graphical representation of Z, Z1, and Z2 in this case.

Figure 3-19. Relation of Z1 and Z2

Four rounding modes are available through the floating-point rounding control field (RN) 
in the FPSCR. See Section 2.1.4, “Floating-Point Status and Control Register (FPSCR).” 
These are encoded as follows in Table 3-8. 

Table 3-8. FPSCR Bit Settings — RN Field

RN   Rounding Mode            Rules

00   Round to nearest         Choose the best approximation (Z1 or Z2). In case of a tie,
                              choose the one that is even (least-significant bit 0).
01   Round toward zero        Choose the smaller in magnitude (Z1 or Z2).
10   Round toward +infinity   Choose Z1.
11   Round toward -infinity   Choose Z2.
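
The selection rules in Table 3-8 can be sketched in C with a deliberately small model. The model is an assumption made for illustration: the exact value Z is given as an integer count of eighths, and the target format is the set of integers, so Z2 is the floor of Z and Z1 is Z2 + 1. The names rn_mode and round_eighths are hypothetical and are not defined by the architecture.

```c
#include <assert.h>

/* RN encodings from Table 3-8. */
enum rn_mode { RN_NEAREST = 0, RN_ZERO = 1, RN_PINF = 2, RN_NINF = 3 };

/* Round the exact value n/8 to an integer according to the RN field. */
static long round_eighths(long n, enum rn_mode rn)
{
    long z2 = (n >= 0) ? n / 8 : -((-n + 7) / 8);  /* next smaller: floor(n/8) */
    long rem = n - z2 * 8;                         /* distance above Z2, 0..7 */
    long z1;

    if (rem == 0)
        return z2;             /* Z is representable: same result in all modes */
    z1 = z2 + 1;               /* next larger representable value */

    switch (rn) {
    case RN_NEAREST:
        if (rem > 4) return z1;
        if (rem < 4) return z2;
        return (z1 % 2 == 0) ? z1 : z2;   /* tie: choose the even candidate */
    case RN_ZERO:
        return (n > 0) ? z2 : z1;         /* smaller in magnitude */
    case RN_PINF:
        return z1;
    case RN_NINF:
        return z2;
    }
    return z2;                 /* not reached for a valid RN value */
}
```

For example, round_eighths(20, RN_NEAREST) treats Z = 2.5 as a tie and returns the even neighbor 2, while round_eighths(28, RN_NEAREST) rounds Z = 3.5 to 4.
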


Rounding occurs before an overflow condition is detected. This means that while an 
infinitely precise value with unbounded exponent range may be greater than the greatest 
representable value, the rounding mode may allow that value to be rounded to a 
representable value. In this case, no overflow condition occurs.

However, the underflow condition is tested before rounding. Therefore, if the value that is
infinitely precise and with unbounded exponent range falls within the range of
unrepresentable values, the underflow condition occurs. The results in these cases are
defined in Section 3.3.6.2.2, “Underflow Exception Condition.” Figure 3-20 shows the
selection of Z1 and Z2 for the four possible rounding modes that are provided by
FPSCR[RN].


Figure 3-20. Selection of Z1 and Z2 for the Four Rounding Modes

All arithmetic, rounding, and conversion instructions affect FPSCR bits FR and FI,
according to whether the rounded result is inexact (FI) and whether the fraction was
incremented (FR) as shown in Figure 3-21. If the rounded result is inexact, FI is set and FR
may be either set or cleared. If rounding does not change the result, both FR and FI are
cleared. The optional fresx and frsqrtex instructions set FI and FR to undefined values;
other floating-point instructions do not alter FR and FI.

Figure 3-21. Rounding Flags in FPSCR

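The relationship between the rounded result and the FI and FR flags can be sketched as follows. This is a simplified model, not the hardware algorithm: the exact value is assumed to be an integer count of eighths and the rounded result an integer, so "fraction incremented" is modeled as the rounded magnitude exceeding the exact magnitude. The names round_flags and rounding_flags are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct round_flags {
    int fi;   /* rounded result is inexact */
    int fr;   /* rounding increased the magnitude */
};

/* exact_eighths: exact value in units of 1/8; rounded: delivered integer. */
static struct round_flags rounding_flags(long exact_eighths, long rounded)
{
    struct round_flags flags;
    long rounded_eighths = rounded * 8;

    flags.fi = (rounded_eighths != exact_eighths);
    flags.fr = (labs(rounded_eighths) > labs(exact_eighths));
    return flags;
}
```

In this model an exact result gives FI = 0 and FR = 0, and an inexact result gives FI = 1 with FR either 0 or 1, which matches the cases described above.
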

3.3.6 Floating-Point Program Exceptions 

The computational instructions of the PowerPC architecture are the only instructions that 
can cause floating-point enabled exceptions (subsets of the program exception). In the 
processor, floating-point program exceptions are signaled by condition bits set in the 
floating-point status and control register (FPSCR) as described in this section and in 
Chapter 2, “PowerPC Register Set.” These bits correspond to those conditions identified as 
IEEE floating-point exceptions and can cause the system floating-point enabled exception 
error handler to be invoked. Handling for floating-point exceptions is described in 
Section 6.4.7, “Program Exception (0x00700).” 

The FPSCR is shown in Figure 3-22. 


Figure 3-22. Floating-Point Status and Control Register (FPSCR)


A listing of FPSCR bit settings is shown in Table 3-9. 


Table 3-9. FPSCR Bit Settings

Bit(s)  Name     Description

0       FX       Floating-point exception summary. Every floating-point instruction, except mtfsfi and mtfsf,
                 implicitly sets FPSCR[FX] if that instruction causes any of the floating-point exception bits in
                 the FPSCR to transition from 0 to 1. The mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1
                 instructions can alter FPSCR[FX] explicitly. This is a sticky bit.

1       FEX      Floating-point enabled exception summary. This bit signals the occurrence of any of the
                 enabled exception conditions. It is the logical OR of all the floating-point exception bits masked
                 by their respective enable bits (FEX = (VX & VE) | (OX & OE) | (UX & UE) | (ZX & ZE) | (XX
                 & XE)). The mcrfs, mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot alter FPSCR[FEX]
                 explicitly. This is not a sticky bit.

2       VX       Floating-point invalid operation exception summary. This bit signals the occurrence of any
                 invalid operation exception. It is the logical OR of all of the invalid operation exception bits as
                 described in Section 3.3.6.1.1, “Invalid Operation Exception Condition.” The mcrfs, mtfsf,
                 mtfsfi, mtfsb0, and mtfsb1 instructions cannot alter FPSCR[VX] explicitly. This is not a sticky
                 bit.

3       OX       Floating-point overflow exception. This is a sticky bit. See Section 3.3.6.2, “Overflow,
                 Underflow, and Inexact Exception Conditions.”

4       UX       Floating-point underflow exception. This is a sticky bit. See Section 3.3.6.2.2, “Underflow
                 Exception Condition.”

5       ZX       Floating-point zero divide exception. This is a sticky bit. See Section 3.3.6.1.2, “Zero Divide
                 Exception Condition.”

6       XX       Floating-point inexact exception. This is a sticky bit. See Section 3.3.6.2.3, “Inexact Exception
                 Condition.”
                 FPSCR[XX] is the sticky version of FPSCR[FI]. The following rules describe how FPSCR[XX]
                 is set by a given instruction:
                 • If the instruction affects FPSCR[FI], the new value of FPSCR[XX] is obtained by logically
                   ORing the old value of FPSCR[XX] with the new value of FPSCR[FI].
                 • If the instruction does not affect FPSCR[FI], the value of FPSCR[XX] is unchanged.

7       VXSNAN   Floating-point invalid operation exception for SNaN. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

8       VXISI    Floating-point invalid operation exception for ∞ - ∞. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

9       VXIDI    Floating-point invalid operation exception for ∞ ÷ ∞. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

10      VXZDZ    Floating-point invalid operation exception for 0 ÷ 0. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

11      VXIMZ    Floating-point invalid operation exception for ∞ * 0. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

12      VXVC     Floating-point invalid operation exception for invalid compare. This is a sticky bit.
                 See Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

13      FR       Floating-point fraction rounded. The last arithmetic, rounding, or conversion instruction
                 incremented the fraction. See Section 3.3.5, “Rounding.” This bit is not sticky.
Table 3-9. FPSCR Bit Settings (Continued)

Bit(s)  Name     Description

14      FI       Floating-point fraction inexact. The last arithmetic, rounding, or conversion instruction either
                 produced an inexact result during rounding or caused a disabled overflow exception. See
                 Section 3.3.5, “Rounding.” This is not a sticky bit. For more information regarding the
                 relationship between FPSCR[FI] and FPSCR[XX], see the description of the FPSCR[XX] bit.

15-19   FPRF     Floating-point result flags. For arithmetic, rounding, and conversion instructions the field is
                 based on the result placed into the target register, except that if any portion of the result is
                 undefined, the value placed here is undefined.
                 15     Floating-point result class descriptor (C). Arithmetic, rounding, and conversion
                        instructions may set this bit with the FPCC bits to indicate the class of the result as
                        shown in Table 3-10.
                 16-19  Floating-point condition code (FPCC). Floating-point compare instructions always
                        set one of the FPCC bits to one and the other three FPCC bits to zero. Arithmetic,
                        rounding, and conversion instructions may set the FPCC bits with the C bit to
                        indicate the class of the result. Note: In this case the high-order three bits of the
                        FPCC retain their relational significance indicating that the value is less than,
                        greater than, or equal to zero.
                        16  Floating-point less than or negative (FL or <)
                        17  Floating-point greater than or positive (FG or >)
                        18  Floating-point equal or zero (FE or =)
                        19  Floating-point unordered or NaN (FU or ?)
                 Note: These are not sticky bits.

20      —        Reserved

21      VXSOFT   Floating-point invalid operation exception for software request. This is a sticky bit. This bit can
                 be altered only by the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1 instructions. For more detailed
                 information, refer to Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

22      VXSQRT   Floating-point invalid operation exception for invalid square root. This is a sticky bit. For more
                 detailed information, refer to Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

23      VXCVI    Floating-point invalid operation exception for invalid integer convert. This is a sticky bit. See
                 Section 3.3.6.1.1, “Invalid Operation Exception Condition.”

24      VE       Floating-point invalid operation exception enable. See Section 3.3.6.1.1, “Invalid Operation
                 Exception Condition.”

25      OE       IEEE floating-point overflow exception enable. See Section 3.3.6.2, “Overflow, Underflow, and
                 Inexact Exception Conditions.”

26      UE       IEEE floating-point underflow exception enable. See Section 3.3.6.2.2, “Underflow Exception
                 Condition.”

27      ZE       IEEE floating-point zero divide exception enable. See Section 3.3.6.1.2, “Zero Divide
                 Exception Condition.”

28      XE       Floating-point inexact exception enable. See Section 3.3.6.2.3, “Inexact Exception Condition.”

Table 3-9. FPSCR Bit Settings (Continued)

Bit(s)  Name     Description

29      NI       Floating-point non-IEEE mode. If this bit is set, results need not conform with IEEE standards
                 and the other FPSCR bits may have meanings other than those described here. If the bit is set
                 and if all implementation-specific requirements are met and if an IEEE-conforming result of a
                 floating-point operation would be a denormalized number, the result produced is zero
                 (retaining the sign of the denormalized number). Any other effects associated with setting this
                 bit are described in the user’s manual for the implementation.
                 Effects of the setting of this bit are implementation-dependent.

30-31   RN       Floating-point rounding control. See Section 3.3.5, “Rounding.”
                 00  Round to nearest
                 01  Round toward zero
                 10  Round toward +infinity
                 11  Round toward -infinity
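
The bit positions in Table 3-9 and the summary-bit rules for VX, FEX, and XX can be expressed as a small C model. This is only a sketch of the relationships stated in the table, operating on a 32-bit software copy of the FPSCR; it does not access the register itself. The macro and function names (FPSCR_BIT, fpscr_refresh_summaries, fpscr_apply_fi) are hypothetical. Note that the architecture numbers bits from the most-significant end, so bit 0 is the high-order bit.

```c
#include <assert.h>
#include <stdint.h>

/* PowerPC bit numbering: bit 0 is the most-significant bit of the word. */
#define FPSCR_BIT(n)   (UINT32_C(1) << (31 - (n)))

#define FPSCR_FX       FPSCR_BIT(0)
#define FPSCR_FEX      FPSCR_BIT(1)
#define FPSCR_VX       FPSCR_BIT(2)
#define FPSCR_OX       FPSCR_BIT(3)
#define FPSCR_UX       FPSCR_BIT(4)
#define FPSCR_ZX       FPSCR_BIT(5)
#define FPSCR_XX       FPSCR_BIT(6)
#define FPSCR_VXSNAN   FPSCR_BIT(7)
#define FPSCR_VXISI    FPSCR_BIT(8)
#define FPSCR_VXIDI    FPSCR_BIT(9)
#define FPSCR_VXZDZ    FPSCR_BIT(10)
#define FPSCR_VXIMZ    FPSCR_BIT(11)
#define FPSCR_VXVC     FPSCR_BIT(12)
#define FPSCR_FI       FPSCR_BIT(14)
#define FPSCR_VXSOFT   FPSCR_BIT(21)
#define FPSCR_VXSQRT   FPSCR_BIT(22)
#define FPSCR_VXCVI    FPSCR_BIT(23)
#define FPSCR_VE       FPSCR_BIT(24)
#define FPSCR_OE       FPSCR_BIT(25)
#define FPSCR_UE       FPSCR_BIT(26)
#define FPSCR_ZE       FPSCR_BIT(27)
#define FPSCR_XE       FPSCR_BIT(28)

/* All individual invalid operation exception bits. */
#define FPSCR_VX_CAUSES (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI | \
                         FPSCR_VXZDZ | FPSCR_VXIMZ | FPSCR_VXVC |   \
                         FPSCR_VXSOFT | FPSCR_VXSQRT | FPSCR_VXCVI)

/* Recompute the non-sticky summary bits VX and FEX from the other bits. */
static uint32_t fpscr_refresh_summaries(uint32_t fpscr)
{
    int vx = (fpscr & FPSCR_VX_CAUSES) != 0;
    int fex;

    fpscr = vx ? (fpscr | FPSCR_VX) : (fpscr & ~FPSCR_VX);
    fex = (vx && (fpscr & FPSCR_VE)) ||
          ((fpscr & FPSCR_OX) && (fpscr & FPSCR_OE)) ||
          ((fpscr & FPSCR_UX) && (fpscr & FPSCR_UE)) ||
          ((fpscr & FPSCR_ZX) && (fpscr & FPSCR_ZE)) ||
          ((fpscr & FPSCR_XX) && (fpscr & FPSCR_XE));
    return fex ? (fpscr | FPSCR_FEX) : (fpscr & ~FPSCR_FEX);
}

/* XX is the sticky version of FI: new XX = old XX OR new FI. */
static uint32_t fpscr_apply_fi(uint32_t fpscr, int fi)
{
    fpscr = fi ? (fpscr | FPSCR_FI) : (fpscr & ~FPSCR_FI);
    if (fi)
        fpscr |= FPSCR_XX;
    return fpscr;
}
```

For instance, a copy holding only ZX and ZE yields FEX = 1 after fpscr_refresh_summaries, while a copy holding only VXSNAN yields VX = 1 and FEX = 0 because VE is clear.
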


Table 3-10 illustrates the floating-point result flags used by PowerPC processors. The result 
flags correspond to FPSCR bits 15-19 (the FPRF field). 


Table 3-10. Floating-Point Result Flags — FPSCR[FPRF]

Result Flags (Bits 15-19)    Result Value Class
C   <   >   =   ?

1   0   0   0   1            Quiet NaN
0   1   0   0   1            -Infinity
0   1   0   0   0            -Normalized number
1   1   0   0   0            -Denormalized number
1   0   0   1   0            -Zero
0   0   0   1   0            +Zero
1   0   1   0   0            +Denormalized number
0   0   1   0   0            +Normalized number
0   0   1   0   1            +Infinity
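
As an illustration of Table 3-10, the following C sketch computes the five-bit FPRF encoding (C in the high-order position, then <, >, =, ?) for a double-precision value on the host. The function name fprf_for is hypothetical, and the sketch assumes the host's fpclassify and signbit macros follow IEEE-754 classification. It reports every NaN with the Quiet NaN encoding, since the table defines no encoding for a signaling NaN result.

```c
#include <assert.h>
#include <math.h>

/* Return the FPRF encoding (bits 15-19, C first) for a result value. */
static unsigned fprf_for(double result)
{
    int negative = signbit(result) != 0;

    switch (fpclassify(result)) {
    case FP_NAN:
        return 0x11;                       /* 1 0001  Quiet NaN */
    case FP_INFINITE:
        return negative ? 0x09 : 0x05;     /* 0 1001 / 0 0101 */
    case FP_NORMAL:
        return negative ? 0x08 : 0x04;     /* 0 1000 / 0 0100 */
    case FP_SUBNORMAL:
        return negative ? 0x18 : 0x14;     /* 1 1000 / 1 0100 */
    default:                               /* FP_ZERO */
        return negative ? 0x12 : 0x02;     /* 1 0010 / 0 0010 */
    }
}
```

The low-order four bits are the FPCC field, so a caller interested only in the sign or zero relation can mask the result with 0x0F.
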


The following conditions that can cause program exceptions are detected by the processor. 
These conditions may occur during execution of computational floating-point instructions. 
The corresponding bits set in the FPSCR are indicated in parentheses: 

• Invalid operation exception condition (VX)

— SNaN condition (VXSNAN)

— Infinity - infinity condition (VXISI)

— Infinity ÷ infinity condition (VXIDI)

— Zero ÷ zero condition (VXZDZ)

— Infinity * zero condition (VXIMZ)

— Invalid compare condition (VXVC)

— Software request condition (VXSOFT)

— Invalid integer convert condition (VXCVI)

— Invalid square root condition (VXSQRT)

These exception conditions are described in Section 3.3.6.1.1, “Invalid Operation
Exception Condition.”

• Zero divide exception condition (ZX). These exception conditions are described in
Section 3.3.6.1.2, “Zero Divide Exception Condition.”

• Overflow Exception Condition (OX). These exception conditions are described in
Section 3.3.6.2.1, “Overflow Exception Condition.”

• Underflow Exception Condition (UX). These exception conditions are described in
Section 3.3.6.2.2, “Underflow Exception Condition.”

• Inexact Exception Condition (XX). These exception conditions are described in
Section 3.3.6.2.3, “Inexact Exception Condition.”

Each floating-point exception condition and each category of invalid IEEE floating-point 
operation exception condition has a corresponding exception bit in the FPSCR which 
indicates the occurrence of that condition. Generally, the occurrence of an exception 
condition depends only on the instruction and its arguments (with one deviation, described 
below). When one or more exception conditions arise during the execution of an 
instruction, the way in which the instruction completes execution depends on the value of 
the IEEE floating-point enable bits in the FPSCR which govern those exception conditions. 
If no governing enable bit is set to 1, the instruction delivers a default result. Otherwise, 
specific condition bits and the FX bit in the FPSCR are set and instruction execution is 
completed by suppressing or delivering a result. Finally, after the instruction execution has 
completed, a nonzero FEX bit in the FPSCR causes a program exception if either FE0 or FE1
is set in the MSR (invoking the system error handler). The values in the FPRs immediately
after the occurrence of an enabled exception do not depend on the FE0 and FE1 bits.

The floating-point exception summary bit (FX) in the FPSCR is set by any floating-point 
instruction (except mtfsfi and mtfsf) that causes any of the exception bits in the FPSCR to 
change from 0 to 1, or by mtfsfi, mtfsf, and mtfsb1 instructions that explicitly set one of
these bits. FPSCR[FEX] is set when any of the exception condition bits is set and the 
exception is enabled (enable bit is one). 
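
The 0-to-1 transition rule for FX can be sketched as a comparison of the FPSCR image before and after an instruction. The function below is a hypothetical software model, not an architected operation. It covers only the implicit setting by ordinary floating-point instructions; the explicit case for mtfsfi, mtfsf, and mtfsb1 is omitted. The mask value assumes the bit positions given in Table 3-9 (bits 3 to 12 and 21 to 23 are the exception bits).

```c
#include <assert.h>
#include <stdint.h>

#define FPSCR_BIT(n)          (UINT32_C(1) << (31 - (n)))
#define FPSCR_FX              FPSCR_BIT(0)
/* OX, UX, ZX, XX, VXSNAN..VXVC (bits 3-12) and VXSOFT, VXSQRT, VXCVI (21-23). */
#define FPSCR_EXCEPTION_BITS  UINT32_C(0x1FF80700)

/* Set FX if any exception bit changed from 0 to 1; FX is sticky otherwise. */
static uint32_t fpscr_commit(uint32_t old_fpscr, uint32_t new_fpscr)
{
    uint32_t newly_set = new_fpscr & ~old_fpscr & FPSCR_EXCEPTION_BITS;

    if (newly_set != 0)
        new_fpscr |= FPSCR_FX;
    return new_fpscr;
}
```

An exception bit that was already 1 before the instruction does not set FX again, which is why software that wants to detect a new occurrence typically clears the sticky bits first.
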

A single instruction may set more than one exception condition bit only in the following 
cases: 

• The inexact exception condition bit (FPSCR[XX]) may be set with the overflow 
exception condition bit (FPSCR[OX]). 

• The inexact exception condition bit (FPSCR[XX]) may be set with the underflow 
exception condition bit (FPSCR[UX]).

• The invalid IEEE floating-point operation exception condition bit (SNaN) may be
set with the invalid IEEE floating-point operation exception condition bit (∞ * 0)
(FPSCR[VXIMZ]) for multiply-add instructions.

• The invalid operation exception condition bit (SNaN) may be set with the invalid
IEEE floating-point operation exception condition bit (invalid compare)
(FPSCR[VXVC]) for compare ordered instructions.

• The invalid IEEE floating-point operation exception condition bit (SNaN) may be 
set with the invalid IEEE floating-point operation exception condition bit (invalid 
integer convert) (FPSCR[VXCVI]) for convert-to-integer instructions. 

Instruction execution is suppressed for the following kinds of exception conditions, so that 
there is no possibility that one of the operands is lost: 

• Enabled invalid IEEE floating-point operation 

• Enabled zero divide 

For the remaining kinds of exception conditions, a result is generated and written to the 
destination specified by the instruction causing the exception condition. The result may 
depend on whether the condition is enabled or disabled. The kinds of exception conditions 
that deliver a result are the following: 

• Disabled invalid IEEE floating-point operation 

• Disabled zero divide 

• Disabled overflow 

• Disabled underflow 

• Disabled inexact 

• Enabled overflow 

• Enabled underflow 

• Enabled inexact 

Subsequent sections define each of the floating-point exception conditions and specify the 
action taken when they are detected. 

The IEEE standard specifies the handling of exception conditions in terms of traps and trap 
handlers. In the PowerPC architecture, an FPSCR exception enable bit being set causes 
generation of the result value specified in the IEEE standard for the trap enabled case — the 
expectation is that the exception is detected by hardware which will notify software by 
taking an exception (trap). The software exception handler will revise the result. An FPSCR 
exception enable bit of 0 causes generation of the default result value specified for the trap 
disabled (or no trap occurs or trap is not implemented) case — the expectation is that the 
exception will not be detected by software (because the hardware doesn’t trap or take the 
exception), which will simply use the default result. The result to be delivered in each case 
for each exception is described in the following sections.

The IEEE default behavior when an exception occurs, which is to generate a default value
and not to notify software, is obtained by clearing all FPSCR exception enable bits and 
using ignore exceptions mode (see Table 3-11). In this case the system floating-point 
enabled exception error handler is not invoked, even if floating-point exceptions occur. If 
necessary, software can inspect the FPSCR exception bits to determine whether exceptions 
have occurred. 

If the system error handler is to be invoked, the corresponding FPSCR exception enable bit 
must be set and a mode other than ignore exceptions mode must be used. In this case the 
system floating-point enabled exception error handler is invoked if an enabled floating- 
point exception condition occurs. 

Whether and how the system floating-point enabled exception error handler is invoked if an 
enabled floating-point exception occurs is controlled by MSR bits FE0 and FE1 as shown
in Table 3-11. (The system floating-point enabled exception error handler is never invoked 
if the appropriate floating-point exception is disabled.) 


Table 3-11. MSR[FE0] and MSR[FE1] Bit Settings for FP Exceptions

FE0  FE1  Description

0    0    Ignore exceptions mode: Floating-point exceptions do not cause the program exception error
          handler to be invoked.

0    1    Imprecise nonrecoverable mode: When an exception occurs, the exception handler is invoked at
          some point at or beyond the instruction that caused the exception. It may not be possible to identify
          the excepting (offending) instruction or the data that caused the exception. Results from the
          excepting instruction may have been used by or affected subsequent instructions executed before the
          exception handler was invoked.

1    0    Imprecise recoverable mode: When an enabled exception occurs, the floating-point enabled
          exception handler is invoked at some point at or beyond the instruction that caused the exception.
          Sufficient information is provided to the exception handler that it can identify the excepting (offending)
          instruction and correct any faulty results. In this mode, no results caused by the excepting instruction
          have been used by or affected subsequent instructions that are executed before the exception
          handler is invoked. Running in this mode may cause degradation in performance.

1    1    Precise mode: The system floating-point enabled exception error handler is invoked precisely at the
          instruction that caused the enabled exception. Running in this mode may cause degradation in
          performance.
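
The decoding in Table 3-11, together with the rule that the handler is invoked only when an enabled exception is pending, can be sketched in C. The enumeration and function names are hypothetical. The FE0, FE1, and FEX values are passed as plain integers so that the sketch does not depend on any particular MSR or FPSCR layout.

```c
#include <assert.h>

enum fpmode {
    FPMODE_IGNORE,                    /* FE0 = 0, FE1 = 0 */
    FPMODE_IMPRECISE_NONRECOVERABLE,  /* FE0 = 0, FE1 = 1 */
    FPMODE_IMPRECISE_RECOVERABLE,     /* FE0 = 1, FE1 = 0 */
    FPMODE_PRECISE                    /* FE0 = 1, FE1 = 1 */
};

/* Decode the floating-point exception mode from the two MSR bits. */
static enum fpmode fp_exception_mode(int fe0, int fe1)
{
    if (!fe0)
        return fe1 ? FPMODE_IMPRECISE_NONRECOVERABLE : FPMODE_IGNORE;
    return fe1 ? FPMODE_PRECISE : FPMODE_IMPRECISE_RECOVERABLE;
}

/* The handler runs only if FEX is set and the mode is not ignore exceptions. */
static int handler_invoked(int fex, int fe0, int fe1)
{
    return fex && fp_exception_mode(fe0, fe1) != FPMODE_IGNORE;
}
```

The second function does not say when the handler runs relative to the excepting instruction; that timing is what distinguishes the three non-ignore modes.
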


In precise mode, whenever the system floating-point enabled exception error handler is 
invoked, the architecture ensures that all instructions logically residing before the excepting 
instruction have completed and no instruction after the excepting instruction has been 
executed. In an imprecise mode, the instruction flow may not be interrupted at the point of 
the instruction that caused the exception. The instruction at which the system floating-point 
exception handler is invoked has not been executed unless it is the excepting instruction and 
the exception is not suppressed. 

In either of the imprecise modes, any FPSCR instruction can be used to force the
occurrence of any invocations of the floating-point enabled exception handler, due to
instructions initiated before the FPSCR instruction. This forcing has no effect in ignore
exceptions mode and is superfluous for precise mode.

Instead of using an FPSCR instruction, an execution synchronizing instruction or event can 
be used to force exceptions and set bits in the FPSCR; however, for the best performance 
across the widest range of implementations, an FPSCR instruction should be used to 
achieve these effects. 

For the best performance across the widest range of implementations, the following 
guidelines should be considered: 

• If IEEE default results are acceptable to the application, FE0 and FE1 should be
cleared (ignore exceptions mode). All FPSCR exception enable bits should be 
cleared. 

• If IEEE default results are unacceptable to the application, an imprecise mode 
should be used with the FPSCR enable bits set as needed. 

• Ignore exceptions mode should not, in general, be used when any FPSCR exception 
enable bits are set. 

• Precise mode may degrade performance in some implementations, perhaps 
substantially, and therefore should be used only for debugging and other specialized 
applications. 

3.3.6.1 Invalid Operation and Zero Divide Exception Conditions

The flow diagram in Figure 3-23 shows the initial flow for checking floating-point 
exception conditions (invalid operation and divide by zero conditions). In any of these cases 
of floating-point exception conditions, if the FPSCR[FEX] bit is set (implicitly) and
MSR[FE0–FE1] ≠ 00, the processor takes a program exception (floating-point enabled
exception type). Refer to Chapter 6, “Exceptions,” for more information on exception
processing. The actions performed for each floating-point exception condition are
described in greater detail in the following sections.

Figure 3-23. Initial Flow for Floating-Point Exception Conditions

3.3.6.1.1 Invalid Operation Exception Condition

An invalid operation exception occurs when an operand is invalid for the specified 
operation. The invalid operations are as follows: 

• Any operation except load, store, move, select, or mtfsf on a signaling NaN (SNaN) 

• For add or subtract operations, magnitude subtraction of infinities (∞ - ∞)

• Division of infinity by infinity (∞ ÷ ∞)

• Division of zero by zero (0 ÷ 0)

• Multiplication of infinity by zero (∞ * 0)

• Ordered comparison involving a NaN (invalid compare) 

• Square root or reciprocal square root of a negative, nonzero number (invalid square 
root). 

NOTE: If the implementation does not support the optional floating-point square 
root or floating-point reciprocal square root estimate instructions, software 
can simulate the instruction and set the FPSCR[VXSQRT] bit to reflect the
exception. 

• Integer convert involving a number that is too large in magnitude to be represented 
in the target format, or involving an infinity or a NaN (invalid integer convert) 

FPSCR[VXSOFT] allows software to cause an invalid operation exception for a condition
that is not necessarily associated with the execution of a floating-point instruction. For 
example, it might be set by a program that computes a square root if the source operand is 
negative. This allows PowerPC instructions not implemented in hardware to be emulated. 


Any time an invalid operation occurs or software explicitly requests the exception via
FPSCR[VXSOFT] (regardless of the value of FPSCR[VE]), the following actions are
taken:

• One or two invalid operation exception condition bits are set:

FPSCR[VXSNAN]   (if SNaN)
FPSCR[VXISI]    (if ∞ - ∞)
FPSCR[VXIDI]    (if ∞ ÷ ∞)
FPSCR[VXZDZ]    (if 0 ÷ 0)
FPSCR[VXIMZ]    (if ∞ * 0)
FPSCR[VXVC]     (if invalid comparison)
FPSCR[VXSOFT]   (if software request)
FPSCR[VXSQRT]   (if invalid square root)
FPSCR[VXCVI]    (if invalid integer convert)

• If the operation is a compare,

FPSCR[FR, FI, C] are unchanged
FPSCR[FPCC] is set to reflect unordered

• If software explicitly requests the exception,

FPSCR[FR, FI, FPRF] are as set by the mtfsfi, mtfsf, or mtfsb1 instruction.





There are additional actions performed that depend on the value of FPSCR[VE]. These are
described in Table 3-12. 


Table 3-12. Additional Actions Performed for Invalid FP Operations

                                                     Action Performed
Invalid Operation                 Result Category    FPSCR[VE] = 1        FPSCR[VE] = 0

Arithmetic or floating-point      frD                Unchanged            QNaN
round to single                   FPSCR[FR, FI]      Cleared              Cleared
                                  FPSCR[FPRF]        Unchanged            Set for QNaN

Convert to 32-bit integer         frD[0–31]          Unchanged            Undefined
(positive number or +∞)           frD[32–63]         Unchanged            Most positive 32-bit
                                                                          integer value
                                  FPSCR[FR, FI]      Cleared              Cleared
                                  FPSCR[FPRF]        Unchanged            Undefined

Convert to 32-bit integer         frD[0–31]          Unchanged            Undefined
(negative number, NaN, or -∞)     frD[32–63]         Unchanged            Most negative 32-bit
                                                                          integer value
                                  FPSCR[FR, FI]      Cleared              Cleared
                                  FPSCR[FPRF]        Unchanged            Undefined

All cases                         FPSCR[FEX]         Implicitly set       Unchanged
                                                     (causes exception)
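
The default integer results in Table 3-12 for the disabled case (FPSCR[VE] = 0) can be sketched as a small C function. The function name is hypothetical, and it models only the value placed in frD[32–63] once an invalid integer convert has already been detected. It does not decide whether a given operand is invalid, which depends on the rounded value.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Value delivered to frD[32-63] for an invalid convert-to-32-bit-integer
 * operation when FPSCR[VE] = 0. */
static int32_t invalid_convert_default(double operand)
{
    if (operand != operand)        /* NaN compares unequal to itself */
        return INT32_MIN;          /* most negative 32-bit integer */
    return (operand > 0.0) ? INT32_MAX : INT32_MIN;
}
```

Positive operands, including +∞, saturate to the most positive value; negative operands, -∞, and NaNs all produce the most negative value.
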


3.3.6.1.2 Zero Divide Exception Condition

A zero divide exception condition occurs when a divide instruction is executed with a zero 
divisor value and a finite, nonzero dividend value or when an fres or frsqrte instruction is 
executed with a zero operand value. This exception condition indicates an exact infinite 
result from finite operands exception condition corresponding to a mathematical pole 
(divide or fres) or a branch point singularity (frsqrte).

When a zero divide condition occurs, the following actions are taken:

• Zero divide exception condition bit is set: FPSCR[ZX] = 1.

• FPSCR[FR, FI] are cleared.

Additional actions depend on the setting of the zero divide exception condition enable bit,
FPSCR[ZE], as described in Table 3-13.


Table 3-13. Additional Actions Performed for Zero Divide

                   Action Performed
Result Category    FPSCR[ZE] = 1                       FPSCR[ZE] = 0

frD                Unchanged                           ±Infinity (sign determined by XOR of the
                                                       signs of the operands)
FPSCR[FEX]         Implicitly set (causes exception)   Unchanged
FPSCR[FPRF]        Unchanged                           Set to indicate ±infinity
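
The sign rule for the disabled zero divide result can be sketched in C. The function name is hypothetical, and the sketch covers the divide case only: it assumes a zero divisor and a finite, nonzero dividend, and returns the infinity whose sign is the exclusive OR of the operand signs.

```c
#include <assert.h>
#include <math.h>

/* Default frD value for a divide by zero when FPSCR[ZE] = 0. */
static double zero_divide_default(double dividend, double divisor)
{
    int negative;

    assert(divisor == 0.0);    /* precondition: +0 or -0 divisor */
    negative = (signbit(dividend) != 0) ^ (signbit(divisor) != 0);
    return negative ? -INFINITY : INFINITY;
}
```

Because the sign of a zero participates in the rule, dividing a positive number by -0 gives -infinity.
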


3.3.6.2 Overflow, Underflow, and Inexact Exception Conditions

As described earlier, the overflow, underflow, and inexact exception conditions are detected 
after the floating-point instruction has executed and an infinitely precise result with 
unbounded range has been computed. Figure 3-24 shows the flow for the detection of these 
conditions and is a continuation of Figure 3-23. As in the cases of invalid operation, or zero 
divide conditions, if the FPSCR[FEX] bit is implicitly set as described in Table 3-9 and
MSR[FE0–FE1] ≠ 00, the processor takes a program exception (floating-point enabled
exception type). Refer to Chapter 6, “Exceptions,” for more information on exception
processing. The actions performed for each of these floating-point exception conditions
(including the generated result) are described in greater detail in the following sections.

Figure 3-24. Checking of Remaining Floating-Point Exception Conditions

3.3.6.2.1 Overflow Exception Condition

Overflow occurs when the magnitude of what would have been the rounded result (had the 
exponent range been unbounded) is greater than the magnitude of the largest finite number 
of the specified result precision. Regardless of the setting of the overflow exception 
condition enable bit of the FPSCR, the following action is taken: 

• The overflow exception condition bit is set: FPSCR[OX] = 1.

Additional actions are taken that depend on the setting of the overflow exception condition 
enable bit of the FPSCR as described in Table 3-14. 


Table 3-14. Additional Actions Performed for Overflow Exception Condition

                                                       Action Performed
Condition                   Result Category            FPSCR[OE] = 1                    FPSCR[OE] = 0

Double-precision            Exponent of normalized     Adjusted by subtracting 1536     —
arithmetic instructions     intermediate result

Single-precision            Exponent of normalized     Adjusted by subtracting 192      —
arithmetic and frspx        intermediate result
instruction

All cases                   frD                        Rounded result (with adjusted    Default result per Table 3-15
                                                       exponent)
                            FPSCR[XX]                  Set if rounded result differs    Set
                                                       from intermediate result
                            FPSCR[FEX]                 Implicitly set (causes           Unchanged
                                                       exception)
                            FPSCR[FPRF]                Set to indicate ±normal number   Set to indicate ±infinity or
                                                                                        ±normal number
                            FPSCR[FI]                  Reflects rounding                Set
                            FPSCR[FR]                  Reflects rounding                Undefined


When the overflow exception condition is disabled (FPSCR[OE] = 0) and an overflow 
condition occurs, the default result is determined by the rounding mode bit (FPSCR[RN]) 
and the sign of the intermediate result as shown in Table 3-15.

Table 3-15. Target Result for Overflow Exception Disabled Case

FPSCR[RN]                Sign of Intermediate Result   frD

Round to nearest         Positive                      +Infinity
                         Negative                      -Infinity

Round toward zero        Positive                      Format’s largest finite positive number
                         Negative                      Format’s most negative finite number

Round toward +infinity   Positive                      +Infinity
                         Negative                      Format’s most negative finite number

Round toward -infinity   Positive                      Format’s largest finite positive number
                         Negative                      -Infinity
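
Table 3-15 maps directly onto a lookup by rounding mode and sign. The following C sketch shows that mapping for the double-precision format only, using the host's DBL_MAX as the format's largest finite number; a single-precision version would substitute FLT_MAX. The names rn_mode and overflow_default are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* RN encodings from Table 3-8. */
enum rn_mode { RN_NEAREST = 0, RN_ZERO = 1, RN_PINF = 2, RN_NINF = 3 };

/* Default double-precision frD value for a disabled overflow (OE = 0). */
static double overflow_default(enum rn_mode rn, int negative)
{
    switch (rn) {
    case RN_NEAREST:
        return negative ? -INFINITY : INFINITY;
    case RN_ZERO:
        return negative ? -DBL_MAX : DBL_MAX;
    case RN_PINF:
        return negative ? -DBL_MAX : INFINITY;
    case RN_NINF:
        return negative ? -INFINITY : DBL_MAX;
    }
    return 0.0;   /* not reached for a valid RN value */
}
```

The directed modes never round away from their direction, so an overflow toward the opposite side stops at the largest finite magnitude instead of producing an infinity.
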


3.3.6.2.2 Underflow Exception Condition

The underflow exception condition is defined separately for the enabled and disabled states: 

• Enabled — Underflow occurs when the intermediate result is tiny. 

• Disabled — Underflow occurs when the intermediate result is tiny and the rounded 
result is inexact. 

In this context, the term ‘tiny’ refers to a floating-point value that is too small to be 
represented for a particular precision format. 

As shown in Figure 3-24, a tiny result is detected before rounding, when a nonzero 
intermediate result value computed as though it had infinite precision and unbounded 
exponent range is less in magnitude than the smallest normalized number. 
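
The two definitions of underflow can be illustrated with exact rational arithmetic. The sketch below is not part of the architecture definition: the function names are hypothetical, the exact intermediate result is modeled as a Python `Fraction`, and the disabled case models only round-to-nearest in double precision, because it relies on Python's own conversion to `float`.

```python
from fractions import Fraction

# Smallest normalized magnitudes: 2**-1022 (double) and 2**-126 (single).
MIN_NORMAL = {"double": Fraction(1, 2**1022), "single": Fraction(1, 2**126)}


def is_tiny(exact, fmt="double"):
    """True if a nonzero exact result is below the smallest normalized number."""
    return exact != 0 and abs(exact) < MIN_NORMAL[fmt]


def underflow_occurs(exact, ue_enabled):
    """Double-precision underflow test for an exact intermediate result."""
    if not is_tiny(exact, "double"):
        return False
    if ue_enabled:
        return True                      # enabled: tiny alone is sufficient
    delivered = Fraction(float(exact))   # denormalize and round to nearest
    return delivered != exact            # disabled: tiny and inexact
```

With these definitions, 2^-1030 is tiny but exactly representable as a denormalized double, so it underflows only when the exception is enabled.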

If the intermediate result is tiny and the underflow exception condition enable bit is cleared
(FPSCR[UE] = 0), the intermediate result is denormalized (see Section 3.3.3,
“Normalization and Denormalization”) and rounded (see Section 3.3.5, “Rounding”)
before being stored in an FPR. In this case, if the rounding causes the delivered result value
to differ from what would have been computed were both the exponent range and precision
unbounded (the result is inexact), then underflow occurs and FPSCR[UX] is set.

The actions performed for underflow exception conditions are described in Table 3-16.


Table 3-16. Actions Performed for Underflow Conditions

                                                  Action Performed
Condition               Result Category           FPSCR[UE] = 1                 FPSCR[UE] = 0
----------------------  ------------------------  ----------------------------  ----------------------------
Double-precision        Exponent of normalized    Adjusted by adding 1536       —
arithmetic instructions intermediate result

Single-precision        Exponent of normalized    Adjusted by adding 192        —
arithmetic and frspx    intermediate result
instructions

All cases               frD                       Rounded result (with          Denormalized and rounded
                                                  adjusted exponent)            result

                        FPSCR[XX]                 Set if rounded result         Set if rounded result
                                                  differs from intermediate     differs from intermediate
                                                  result                        result

                        FPSCR[UX]                 Set                           Set only if tiny and inexact
                                                                                after denormalization and
                                                                                rounding

                        FPSCR[FPRF]               Set to indicate normalized    Set to indicate ±denormalized
                                                  number                        number or ±zero

                        FPSCR[FEX]                Implicitly set (causes        Unchanged
                                                  exception)

                        FPSCR[FI]                 Reflects rounding             Reflects rounding

                        FPSCR[FR]                 Reflects rounding             Reflects rounding

NOTE: The FR and FI bits in the FPSCR allow the system floating-point enabled 
exception error handler, when invoked because of an underflow exception 
condition, to simulate a trap disabled environment. 

That is, the FR and FI bits allow the system floating-point enabled exception 
error handler to unround the result, thus allowing the result to be denormalized. 
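
The exponent adjustments listed in Table 3-14 and Table 3-16 are simple offsets, which the following sketch makes explicit. The function name and the string labels are hypothetical, and the exponent is treated as an unbounded integer rather than a biased field.

```python
# Exponent adjustments from Tables 3-14 and 3-16.
ADJUSTMENT = {"double": 1536, "single": 192}


def adjusted_exponent(exponent, fmt, condition):
    """Exponent of the result delivered for an enabled overflow or underflow."""
    if condition == "overflow":
        return exponent - ADJUSTMENT[fmt]   # bring an oversized exponent into range
    if condition == "underflow":
        return exponent + ADJUSTMENT[fmt]   # bring a tiny exponent into range
    raise ValueError("condition must be 'overflow' or 'underflow'")
```

An exception handler could recover the true exponent by applying the opposite adjustment to the delivered value.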

3.3.6.2.3 Inexact Exception Condition

The inexact exception condition occurs when either of two conditions occurs during rounding:

• The rounded result differs from the intermediate result assuming the intermediate 
result exponent range and precision to be unbounded. (In the case of an enabled 
overflow or underflow condition, where the exponent of the rounded result is 
adjusted for those conditions, an inexact condition occurs only if the significand of 
the rounded result differs from that of the intermediate result.) 

• The rounded result overflows and the overflow exception condition is disabled.

When an inexact exception condition occurs, the following actions are taken independently
of the setting of the inexact exception condition enable bit of the FPSCR:

• The inexact exception condition bit in the FPSCR is set (FPSCR[XX] = 1).

• The rounded or overflowed result is placed into the target FPR. 

• FPSCR[FPRF] is set to indicate the class and sign of the result. 

In addition, if the inexact exception condition enable bit in the FPSCR (FPSCR[XE]) is set, 
and an inexact condition exists, then the FPSCR[FEX] bit is implicitly set, causing the 
processor to take a floating-point enabled program exception. 

In PowerPC implementations, running with inexact exception conditions enabled may incur
greater latency than running with other types of floating-point exception conditions enabled.


Chapter 4. Addressing Modes and Instruction Set Summary


This chapter describes instructions and addressing modes defined by the three levels of the 
PowerPC architecture — user instruction set architecture (UISA), virtual environment 
architecture (VEA), and operating environment architecture (OEA). These instructions are 
divided into the following functional categories: 

• Integer instructions — These include arithmetic and logical instructions. For more 
information, see Section 4.2.1, “Integer Instructions.” 

• Floating-point instructions — These include floating-point arithmetic instructions, as 
well as instructions that affect the floating-point status and control register (FPSCR). 
For more information, see Section 4.2.2, “Floating-Point Instructions.” 

• Load and store instructions — These include integer and floating-point load and store
instructions. For more information, see Section 4.2.3, “Load and Store Instructions.”

• Flow control instructions — These include branching instructions, condition register 
logical instructions, trap instructions, and other instructions that affect the 
instruction flow. For more information, see Section 4.2.4, “Branch and Flow Control 
Instructions.” 



• Processor control instructions — These instructions are used for synchronizing
memory accesses and managing caches, TLBs, and the segment registers. For
more information, see Section 4.2.5, “Processor Control Instructions — UISA,” 
Section 4.3.1, “Processor Control Instructions — VEA,” and Section 4.4.2, 
“Processor Control Instructions — OEA.” 


• Memory synchronization instructions — These instructions control the order in 
which memory operations are completed with respect to asynchronous events, and 
the order in which memory operations are seen by other processors or memory 
access mechanisms. For more information, see Section 4.2.6, “Memory 
Synchronization Instructions — UISA,” and Section 4.3.2, “Memory 
Synchronization Instructions — VEA.” 

• Memory control instructions — These include cache management instructions (user- 
level and supervisor-level), segment register manipulation instructions, and 
translation lookaside buffer management instructions. For more information, see 
Section 4.3.3, “Memory Control Instructions — VEA,” and Section 4.4.3, “Memory 
Control Instructions — OEA.”

NOTE: User-level and supervisor-level are referred to as problem state and privileged
state, respectively, in the architecture specification.

• External control instructions — These instructions allow a user-level program to 
communicate with a special-purpose device. For more information, see 
Section 4.3.4, “External Control Instructions.” 




This grouping of instructions does not necessarily indicate the execution unit that processes 
a particular instruction or group of instructions within a processor implementation. 

Integer instructions operate on byte, half-word, and word operands. Floating-point 
instructions operate on single-precision and double-precision floating-point operands. The 
PowerPC architecture uses instructions that are four bytes long and word-aligned. It 
provides for byte, half-word, and word operand fetches and stores between memory and a 
set of 32 general-purpose registers (GPRs). It also provides for word and double-word 
operand fetches and stores between memory and a set of 32 floating-point registers (FPRs). 
The FPRs are 64 bits wide in all PowerPC implementations. The GPRs are 32 bits wide. 


Arithmetic and logical instructions do not read or modify memory. To use the contents of 
a memory location in a computation and then modify the same or another memory location, 
the memory contents must be loaded into a register, modified, and then written to the target 
location using load and store instructions. 


The description of each instruction includes the mnemonic and a formatted list of operands. 
PowerPC-compliant assemblers support the mnemonics and operand lists. To simplify 
assembly language programming, a set of simplified mnemonics (referred to as extended 
mnemonics in the architecture specification) and symbols is provided for some of the most 
frequently-used instructions; see Appendix F, “Simplified Mnemonics,” for a complete list 
of simplified mnemonics. 


The instructions are organized by functional categories while maintaining the delineation
of the three levels of the PowerPC architecture (UISA, VEA, and OEA). Section 4.2
discusses the UISA instructions, followed by Section 4.3, which discusses the VEA
instructions, and Section 4.4, which discusses the OEA instructions. See Section 1.1.2, “The
Levels of the PowerPC Architecture,” for more information about the various levels defined
by the PowerPC architecture.


4.1 Conventions 

This section describes conventions used for the PowerPC instruction set. Descriptions of 
computation modes, memory addressing, synchronization, and the PowerPC exception 
summary follow. 

4.1.1 Sequential Execution Model 

The PowerPC processors appear to execute instructions in program order, regardless of
asynchronous events or program exceptions. The execution of a sequence of instructions
may be interrupted by an exception caused by one of the instructions in the sequence, or by
an asynchronous event.

NOTE: The architecture specification refers to exceptions as interrupts. 

For exceptions to the sequential execution model, refer to Chapter 6, “Exceptions.” For
information about the synchronization required when using store instructions to access
instruction areas of memory, refer to Section 4.2.3.3, “Integer Store Instructions,” and
Section 5.1.5.2, “Instruction-Cache Instructions.” For information regarding instruction
fetching, and for information about guarded memory, refer to Section 5.2.1.5, “The
Guarded Attribute (G).”

4.1.2 Computation Modes 

The PowerPC architecture allows for both 32-bit and 64-bit modes; however, this manual
defines only the 32-bit implementation, in which all registers except the FPRs are 32 bits
long, and effective addresses are always 32 bits long.

4.1.3 Classes of Instructions 

PowerPC instructions belong to one of the following three classes: 

• Defined 

• Illegal 

• Reserved 

The class is determined by examining the primary opcode, and the extended opcode if any. 
If the opcode, or the combination of opcode and extended opcode, is not that of a defined 
instruction or of a reserved instruction, the instruction is illegal. 

In future versions of the PowerPC architecture, instruction codings that are now illegal may 
become defined (by being added to the architecture) or reserved (by being assigned to one 
of the special purposes). Likewise, reserved instructions may become defined. 

4.1.3.1 Definition of Boundedly Undefined

The results of executing a given instruction are said to be boundedly undefined if they could
have been achieved by executing an arbitrary sequence of instructions, starting in the state
the machine was in before executing the given instruction. Boundedly undefined results for
a given instruction may vary between implementations, and between different executions
on the same implementation.

4.1.3.2 Defined Instruction Class

Defined instructions contain all the instructions defined in the PowerPC UISA, VEA, and 
OEA. Defined instructions are guaranteed to be supported in all PowerPC implementations 
as stated in the instruction descriptions in Chapter 8, “Instruction set.” A PowerPC 
processor may invoke the illegal instruction error handler (part of the program exception 
handler) when an unimplemented PowerPC instruction is encountered so that it may be 
emulated in software, as required. 

A defined instruction can have invalid forms, as described in Section 4.1.3.2.2, “Invalid
Instruction Forms.”

4.1.3.2.1 Preferred Instruction Forms

A defined instruction may have an instruction form that is preferred (that is, the instruction 
will execute in an efficient manner). Any form other than the preferred form may take 
significantly longer to execute. The following instructions have preferred forms: 

• Load/store multiple instructions 

• Load/store string instructions 

• Or immediate instruction (preferred form of no-op) 

4.1.3.2.2 Invalid Instruction Forms

A defined instruction may have an instruction form that is invalid if one or more operands, 
excluding opcodes, are coded incorrectly in a manner that can be deduced by examining 
only the instruction encoding (primary and extended opcodes). Attempting to execute an 
invalid form of an instruction either invokes the illegal instruction error handler (a program 
exception) or yields boundedly-undefined results. See Chapter 8, “Instruction set,” for 
individual instruction descriptions. 

Invalid forms result when a bit or operand is coded incorrectly, for example, or when a 
reserved bit (shown as ‘0’) is coded as ‘1’. 

The following instructions have invalid forms identified in their individual instruction 
descriptions: 

• Branch conditional instructions 

• Load/store with update instructions 

• Load multiple instructions 

• Load string instructions 

• Integer compare instructions 

• Load/store floating-point with update instructions 

4.1.3.2.3 Optional Instructions

A defined instruction may be optional. The optional instructions fall into the following 
categories: 

• General-purpose instructions — fsqrt and fsqrts 

• Graphics instructions — fres, frsqrte, and fsel 

• External control instructions — eciwx and ecowx 

• Lookaside buffer management instructions — tlbia, tlbie, and tlbsync (with
conditions, see Chapter 8, “Instruction set,” for more information)

NOTE: The stfiwx instruction is defined as optional by the PowerPC architecture to
ensure backwards compatibility with earlier processors; however, it will likely be
required for subsequent PowerPC processors.

Additional categories may be defined in future implementations. If an
implementation claims to support a given category, it implements all the
instructions in that category.

Any attempt to execute an optional instruction that is not provided by the implementation
will cause the illegal instruction error handler to be invoked. Exceptions to this rule are
stated in the instruction descriptions found in Chapter 8, “Instruction set.”

4.1.3.3 Illegal Instruction Class

Illegal instructions can be grouped into the following categories: 

• Instructions that are not implemented in the PowerPC architecture. These opcodes 
are available for future extensions of the PowerPC architecture; that is, future 
versions of the PowerPC architecture may define any of these instructions to 
perform new functions. The following primary opcodes are defined as illegal but 
may be used in future extensions to the architecture: 

1, 2, 4, 5, 6, 22, 30, 56, 57, 58, 60, 61, 62 

• All unused extended opcodes are illegal. The unused extended opcodes can be
determined from information in Section A.2, “Instructions Sorted by Opcode,” and
Section 4.1.3.4, “Reserved Instructions.” The following primary opcodes have some
unused extended opcodes:

19, 31, 59, 63

• An instruction consisting entirely of zeros is guaranteed to be an illegal instruction. 
This increases the probability that an attempt to execute data or uninitialized 
memory invokes the illegal instruction error handler (a program exception). 

NOTE: If only the primary opcode consists of all zeros, the instruction is considered a
reserved instruction, as described in Section 4.1.3.4, “Reserved Instructions.”

An attempt to execute an illegal instruction invokes the illegal instruction error handler (a
program exception) but has no other effect. See Section 6.4.7, “Program Exception
(0x00700),” for additional information about the illegal instruction exception.

With the exception of the instruction consisting entirely of binary zeros, the illegal 
instructions are available for further additions to the PowerPC architecture. 
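
The rules above that depend only on the primary opcode can be sketched as follows. This is a partial illustration, not a decoder: the function names and result strings are hypothetical, the primary opcode is assumed to occupy the six most-significant bits of the 32-bit instruction word, and opcodes not covered by the rules in this section are reported without a verdict because classifying them requires the full opcode tables.

```python
# Primary opcodes listed in this section as illegal.
ILLEGAL_PRIMARY = {1, 2, 4, 5, 6, 22, 30, 56, 57, 58, 60, 61, 62}
# Primary opcodes listed as having some unused (illegal) extended opcodes.
HAS_EXTENDED = {19, 31, 59, 63}


def primary_opcode(instruction):
    """Extract the six most-significant bits of a 32-bit instruction word."""
    return (instruction >> 26) & 0x3F


def classify_by_primary(instruction):
    """Classify as far as the primary opcode alone allows."""
    if instruction == 0:
        return "illegal"                      # all-zero word is always illegal
    opcode = primary_opcode(instruction)
    if opcode == 0:
        return "reserved"                     # opcode 0, but not all zeros
    if opcode in ILLEGAL_PRIMARY:
        return "illegal"
    if opcode in HAS_EXTENDED:
        return "depends on extended opcode"
    return "defined or reserved"              # needs the opcode tables to decide
```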

4.1.3.4 Reserved Instructions

Reserved instructions are allocated to specific implementation-dependent purposes not
defined by the PowerPC architecture. An attempt to execute an unimplemented reserved
instruction invokes the illegal instruction error handler (a program exception). See
Section 6.4.7, “Program Exception (0x00700),” for additional information about the illegal
instruction exception.

The following types of instructions are included in this class:

1. Instructions for the POWER architecture that have not been included in the
PowerPC architecture.

2. Implementation-specific instructions used to conform to the PowerPC
architecture specifications (for example, Load Data TLB Entry (tlbld) and
Load Instruction TLB Entry (tlbli) instructions for the PowerPC 603™
microprocessor).

3. The instruction with primary opcode 0, when the instruction does not consist
entirely of binary zeros.

4. Any other implementation-specific instructions that are not defined in the UISA,
VEA, or OEA.

4.1.4 Memory Addressing 

A program references memory using the effective (logical) address computed by the 
processor when it executes a load, store, branch, or cache instruction, and when it fetches 
the next sequential instruction. 

4.1.4.1 Memory Operands

Bytes in memory are numbered consecutively starting with zero. Each number is the
address of the corresponding byte. Within words, bytes are numbered from left to right.

Memory operands may be bytes, half-words, words, or double words, or, for the load/store
multiple and load/store string instructions, a sequence of bytes or words. The address of a
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. The PowerPC architecture supports both 
big-endian and little-endian byte ordering. The default byte and bit ordering is big-endian; 
see Section 3.1.2, “Byte Ordering,” for more information. 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the “natural” address of an operand 
is an integral multiple of the operand length. A memory operand is said to be aligned if it 
is aligned at its natural boundary; otherwise it is misaligned. For a detailed discussion about 
memory operands, see Chapter 3, “Operand Conventions.” 
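
The alignment rule reduces to a single divisibility test, sketched below. The function name is hypothetical, and the operand length is given in bytes.

```python
def is_aligned(address, operand_length):
    """True if address is an integral multiple of the operand length in bytes."""
    return address % operand_length == 0
```

Under this rule a word at address 0x1002 is misaligned, while a half-word at the same address is aligned.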

4.1.4.2 Effective Address Calculation


An effective address (EA) is the 32-bit sum computed by the processor when executing a 
memory access or branch instruction or when fetching the next sequential instruction. For 
a memory access instruction, if the sum of the effective address and the operand length 
exceeds the maximum effective address, the memory operand is considered to wrap around 
from the maximum effective address through effective address 0, as described in the 
following paragraphs. 

Effective address computations for both data and instruction accesses use 32-bit unsigned
binary arithmetic. A carry from bit 0 is ignored. The effective address arithmetic wraps
around from the maximum address, 2^32 - 1, to address 0.
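
The wraparound behavior can be modeled by masking every sum to 32 bits. In the sketch below the function names are hypothetical; discarding the carry out of the most-significant bit corresponds to the mask.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """32-bit unsigned addition; the carry out of bit 0 is discarded."""
    return (a + b) & MASK32


def operand_byte_addresses(ea, length):
    """Effective address of each byte of an operand, wrapping modulo 2**32."""
    return [add32(ea, offset) for offset in range(length)]
```

A word operand at effective address 0xFFFFFFFE therefore occupies bytes 0xFFFFFFFE, 0xFFFFFFFF, 0, and 1.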

In all implementations, the three low-order bits of the calculated effective address may be 
modified by the processor before accessing memory if the PowerPC system is operating in 
little-endian mode. 

See Section 3.1.2, “Byte Ordering,” for more information about little-endian mode.

Load and store operations have three categories of effective address generation that depend
on the operands specified:

• Register indirect with immediate index mode 

• Register indirect with index mode (sum of two registers) 

• Register indirect mode 

See Section 4.2.3.1, “Integer Load and Store Address Generation,” for a detailed
description of effective address generation for load and store operations.
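
The three load and store categories can be sketched as follows. Several details here are assumptions that are not stated in this section: the base operand is taken to follow the (rA|0) notation used elsewhere in this chapter (zero when the rA field is 0), the immediate index is taken to be a 16-bit signed displacement, and the register file is modeled as a plain list. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed displacement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def base(gpr, ra):
    """(rA|0): the contents of rA, or zero when the rA field is 0."""
    return 0 if ra == 0 else gpr[ra]


def ea_immediate_index(gpr, ra, d):
    """Register indirect with immediate index: (rA|0) + d."""
    return (base(gpr, ra) + sign_extend16(d)) & MASK32


def ea_index(gpr, ra, rb):
    """Register indirect with index: (rA|0) + (rB)."""
    return (base(gpr, ra) + gpr[rb]) & MASK32


def ea_register_indirect(gpr, ra):
    """Register indirect: (rA|0)."""
    return base(gpr, ra) & MASK32
```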



Branch instructions have three categories of effective address generation: 

• Immediate addressing

• Link register indirect 

• Count register indirect 

See Section 4.2.4.1, “Branch Instruction Address Calculation,” for a detailed
description of effective address generation for branch instructions.

Branch instructions can optionally load the LR with the next sequential instruction address 
(current instruction address + 4). This is used for subroutine call and return. 


4.1.5 Synchronizing Instructions 

The synchronization described in this section refers to the state of activities within the 
processor that is performing the synchronization. Refer to Section 6.1.2, 
“Synchronization,” for more detailed information about other conditions that can cause 
context and execution synchronization. 


4.1.5.1 Context Synchronizing Instructions

The System Call (sc), Return from Interrupt (rfi), and Instruction Synchronize (isync)
instructions perform context synchronization by allowing previously issued instructions to
complete before continuing with program execution. All three instructions flush the
instruction prefetch queue and start instruction fetching from memory in the context
established after all preceding instructions have completed execution. Context
synchronization satisfies the following conditions:

1. No higher priority exception exists (sc) and instruction fetching and dispatching is 
halted. 

2. All previous instructions have completed to a point where they can no longer cause 
an exception. 

If a previous memory access instruction causes one or more direct-store interface 
error exceptions, the results are guaranteed to be determined before this instruction 
is executed. 



3. Previous instructions complete execution in the context (privilege, protection, and
address translation) under which they were issued.

4. The instructions at the target of the branch of sc and rfi and those following the isync
instruction execute in the context established by these instructions. For the isync
instruction, the instruction fetch queue must be flushed and instruction fetching
restarted at the next sequential instruction. Both sc and rfi execute like a branch, and
the flushing and refetching is automatic.

4.1.5.2 Execution Synchronizing Instructions

An instruction is execution synchronizing if it satisfies the conditions of the first two items
described above for context synchronization. The sync instruction is treated like isync with
respect to the second item described above (that is, the conditions described in the second
item apply to the completion of sync). The sync and mtmsr instructions are examples of
execution-synchronizing instructions.

The isync instruction is concerned mainly with the instruction stream in the processor on
which it is executed, whereas sync looks outward toward the caches and memory and
is concerned with data arriving at memory, where it is visible to other processors in a
multiprocessor environment (for example, cache block store, cache block flush, and so on).

All context-synchronizing instructions are execution-synchronizing. Unlike a context
synchronizing operation, an execution synchronizing instruction need not ensure that the
instructions following it execute in the context established by that instruction. This new
context becomes effective sometime after the execution synchronizing instruction
completes and before or at a subsequent context synchronizing operation.

4.1.6 Exception Summary

PowerPC processors have an exception mechanism for handling system functions and error 
conditions in an orderly way. The exception model is defined by the OEA. There are two 
kinds of exceptions — those caused directly by the execution of an instruction and those 
caused by an asynchronous event. Either may cause components of the system software to 
be invoked. 



Exceptions can be caused directly by the execution of an instruction as follows: 

• An attempt to execute an illegal instruction causes the illegal instruction (program 
exception) error handler to be invoked. An attempt by a user-level program to 
execute the supervisor-level instructions listed below causes the privileged 
instruction (program exception) handler to be invoked. 

The PowerPC architecture provides the following supervisor-level instructions: 

dcbi, mfmsr, mfspr, mfsr, mfsrin, mtmsr, mtspr, mtsr, mtsrin, rfi, tlbia, tlbie,
and tlbsync (defined by OEA).

NOTE: The privilege level of the mfspr and mtspr instructions depends on the 
SPR encoding. 


• The execution of a defined instruction using an invalid form causes either the illegal 
instruction error handler or the privileged instruction handler to be invoked. 

• The execution of an optional instruction that is not provided by the implementation 
causes the illegal instruction error handler to be invoked. 

• An attempt to access memory in a manner that violates memory protection, or an 
attempt to access memory that is not available (page fault), causes the DSI exception 
handler or ISI exception handler to be invoked. 

• An attempt to access memory with an effective address alignment that is invalid for 
the instruction causes the alignment exception handler to be invoked. 

• The execution of an sc instruction permits a program to call on the system to perform 
a service, by causing a system call exception handler to be invoked. 

• The execution of a trap instruction invokes the program exception trap handler. 

• The execution of a floating-point instruction when floating-point instructions are 
disabled invokes the floating-point unavailable exception handler. 

• The execution of an instruction that causes a floating-point exception that is enabled 
invokes the floating-point enabled exception handler. 

• The execution of a floating-point instruction that requires system software assistance 
causes the floating-point assist exception handler to be invoked. The conditions 
under which such software assistance is required are implementation-dependent. 


Exceptions caused by asynchronous events are described in Chapter 6, “Exceptions.”

4.2 PowerPC UISA Instructions

The PowerPC user instruction set architecture (UISA) includes the base user-level 
instruction set (excluding a few user-level cache-control, synchronization, and time base 
instructions), user-level registers, programming model, data types, and addressing modes. 
This section discusses the instructions defined in the UISA. 

4.2.1 Integer Instructions 

The integer instructions consist of the following: 

• Integer arithmetic instructions 

• Integer compare instructions 

• Integer logical instructions 

• Integer rotate and shift instructions 

Integer instructions use the content of the GPRs as source operands and place results into
GPRs. Integer arithmetic, shift, rotate, and string move instructions may update or read
values from the XER, and the condition register (CR) fields may be updated if the Rc bit of
the instruction is set.

These instructions treat the source operands as signed integers unless the instruction is 
explicitly identified as performing an unsigned operation. For example, Multiply High- 
Word Unsigned (mulhwu) and Divide Word Unsigned (divwu) instructions interpret both 
operands as unsigned integers. 

The integer instructions that are coded to update the condition register, and the integer
arithmetic instruction, addic., set CR bits 0-3 (CR0) to characterize the result of the
operation. CR0 is set to reflect a signed comparison of the result to zero.
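
The signed comparison that determines CR0 can be sketched as shown below. The bit names LT, GT, EQ, and SO, their order within the field, and the copying of XER[SO] into the last bit are taken from the general architecture definition rather than from this paragraph, so treat them as assumptions of the sketch. The function names are hypothetical.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def cr0_for_result(result, xer_so):
    """Return CR0 as a 4-bit value in the order LT, GT, EQ, SO."""
    signed = to_signed32(result)
    lt = int(signed < 0)
    gt = int(signed > 0)
    eq = int(signed == 0)
    return (lt << 3) | (gt << 2) | (eq << 1) | int(bool(xer_so))
```

For a result of 0xFFFFFFFF the signed value is -1, so only the LT bit (and SO, if set in the XER) is reported.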

The integer arithmetic instructions, addic, addic., subfic, addc, subfc, adde, subfe,
addme, subfme, addze, and subfze, always set the XER bit, CA, to reflect the carry out of
bit 0. Integer arithmetic instructions with the overflow enable (OE) bit set in the instruction
encoding (instructions with the o suffix) cause XER[SO] and XER[OV] to reflect an
overflow of the result. These integer arithmetic instructions reflect the overflow of the
32-bit result.
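
The carry and overflow indications can be modeled with a 32-bit adder, as in the sketch below. The function names are hypothetical, and the helper returns both flags for every operation; which instructions actually record CA or OV in the XER depends on the mnemonic, as described above. The subtraction uses the complement-and-add form that Table 4-1 gives for the subtract-from instructions.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b, carry_in=0):
    """32-bit add returning (result, CA, OV)."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in
    result = total & MASK32
    ca = total >> 32                 # carry out of bit 0, the most-significant bit
    # Signed overflow: both operands differ in sign from the result.
    ov = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, ca, ov


def subtract_from(ra, rb):
    """Compute not(rA) + (rB) + 1, which equals rB - rA modulo 2**32."""
    return add_with_flags(~ra & MASK32, rb, 1)
```

For example, `subtract_from(3, 10)` returns a result of 7 with a carry out of 1 and no overflow.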

Instructions that select the overflow option (enable XER[OV]) or that set the XER carry bit 
(CA) may delay the execution of subsequent instructions. 

Unless otherwise noted, when CR0 and the XER are set, they characterize the value placed
in the target register.

4.2.1.1 Integer Arithmetic Instructions

Table 4-1 lists the integer arithmetic instructions for the PowerPC processors. 


Table 4-1. Integer Arithmetic Instructions

                              Operand
Name                Mnemonic  Syntax      Operation
------------------  --------  ----------  ------------------------------------------------------
Add Immediate       addi      rD,rA,SIMM  The sum (rA|0) + SIMM is placed into rD.

Add Immediate       addis     rD,rA,SIMM  The sum (rA|0) + (SIMM || 0x0000) is placed into rD.
Shifted

Add                 add       rD,rA,rB    The sum (rA) + (rB) is placed into rD.
                    add.                  add      Add
                    addo                  add.     Add with CR Update. The dot suffix enables
                    addo.                          the update of CR0.
                                          addo     Add with Overflow Enabled. The o suffix
                                                   enables the overflow bits (SO,OV) in the XER.
                                          addo.    Add with Overflow and CR Update. The o.
                                                   suffix enables the update of CR0 and enables
                                                   the overflow bits (SO,OV) in the XER.

Subtract From       subf      rD,rA,rB    The sum ¬ (rA) + (rB) + 1 is placed into rD.
                    subf.                 subf     Subtract From
                    subfo                 subf.    Subtract from with CR Update. The dot suffix
                    subfo.                         enables the update of CR0.
                                          subfo    Subtract from with Overflow Enabled. The o
                                                   suffix enables the overflow bits (SO,OV) in
                                                   the XER.
                                          subfo.   Subtract from with Overflow and CR Update.
                                                   The o. suffix enables the update of CR0 and
                                                   enables the overflow bits (SO,OV) in the XER.

Add Immediate       addic     rD,rA,SIMM  The sum (rA) + SIMM is placed into rD.
Carrying

Add Immediate       addic.    rD,rA,SIMM  The sum (rA) + SIMM is placed into rD. CR0 is updated.
Carrying and
Record

Subtract from       subfic    rD,rA,SIMM  The sum ¬ (rA) + SIMM + 1 is placed into rD.
Immediate Carrying

Add Carrying        addc      rD,rA,rB    The sum (rA) + (rB) is placed into rD.
                    addc.                 addc     Add Carrying
                    addco                 addc.    Add Carrying with CR Update. The dot suffix
                    addco.                         enables the update of CR0.
                                          addco    Add Carrying with Overflow Enabled. The o
                                                   suffix enables the overflow bits (SO,OV) in
                                                   the XER.
                                          addco.   Add Carrying with Overflow and CR Update.
                                                   The o. suffix enables the update of CR0 and
                                                   enables the overflow bits (SO,OV) in the XER.

Table 4-1. Integer Arithmetic Instructions (Continued) 


Name 

Mnemonic 

Operand 

Syntax 

Operation 

Subtract from 

subfc 

rD,rA,rB 

The sum -> (rA) + (rB) + 1 is placed into rD. 

Carrying 

subfc. 

subfco 

subfco. 


subfc Subtract from Carrying 

subfc. Subtract from Carrying with CRO Update. The dot suffix 
enables the update of CRO. 

subfco Subtract from Carrying with Overflow. The o suffix enables the 
overflow bits (SO.OV) in the XER. 

subfco. Subtract from Carrying with Overflow and CRO Update. The 
o. suffix enables the update of CRO and enables the overflow 
bits (SO.OV) in the XER. 

Add 

adde 

rD,rA,rB 

The sum (rA) + (rB) + XER[CA] is placed into rD. 

Extended 

adde. 

addeo 

addeo. 


adde Add Extended 

adde. Add Extended with CR Update. The dot suffix enables the 

update of CRO. 

addeo Add Extended with Overflow. The o suffix enables the 
overflow bits (SO,OV) in the XER. 

addeo. Add Extended with Overflow and CR Update. The o. suffix 
enables the update of CRO and enables the overflow bits 
(SO.OV) in the XER. 

Subtract from Extended

subfe
subfe.
subfeo
subfeo.

rD,rA,rB

The sum ¬(rA) + (rB) + XER[CA] is placed into rD.

subfe Subtract from Extended

subfe. Subtract from Extended with CR Update. The dot suffix
enables the update of CR0.

subfeo Subtract from Extended with Overflow. The o suffix enables
the overflow bits (SO,OV) in the XER.

subfeo. Subtract from Extended with Overflow and CR Update. The o.
suffix enables the update of CR0 and enables the overflow
bits (SO,OV) in the XER.

Add to Minus One Extended

addme
addme.
addmeo
addmeo.

rD,rA

The sum (rA) + XER[CA] added to 0xFFFF_FFFF is placed into rD.

addme Add to Minus One Extended

addme. Add to Minus One Extended with CR Update. The dot suffix
enables the update of CR0.

addmeo Add to Minus One Extended with Overflow. The o suffix
enables the overflow bits (SO,OV) in the XER.

addmeo. Add to Minus One Extended with Overflow and CR Update.
The o. suffix enables the update of CR0 and enables the
overflow bits (SO,OV) in the XER.

Subtract from Minus One Extended

subfme
subfme.
subfmeo
subfmeo.

rD,rA

The sum ¬(rA) + XER[CA] added to 0xFFFF_FFFF is placed into rD.

subfme Subtract from Minus One Extended

subfme. Subtract from Minus One Extended with CR Update. The dot
suffix enables the update of CR0.

subfmeo Subtract from Minus One Extended with Overflow. The o
suffix enables the overflow bits (SO,OV) in the XER.

subfmeo. Subtract from Minus One Extended with Overflow and CR
Update. The o. suffix enables the update of CR0 and enables
the overflow bits (SO,OV) in the XER.






























Add to Zero 
Extended 

addze 

addze. 

addzeo 

addzeo. 

rD,rA 

The sum (rA) + XER[CA] is placed into rD. 

addze Add to Zero Extended 

addze. Add to Zero Extended with CR Update. The dot suffix enables
the update of CR0.

addzeo Add to Zero Extended with Overflow. The o suffix enables the
overflow bits (SO,OV) in the XER.

addzeo. Add to Zero Extended with Overflow and CR Update. The o.
suffix enables the update of CR0 and enables the overflow
bits (SO,OV) in the XER.

Subtract from 
Zero Extended 

subfze 

subfze. 

subfzeo 

subfzeo. 

rD,rA 

The sum ¬(rA) + XER[CA] is placed into rD.

subfze Subtract from Zero Extended

subfze. Subtract from Zero Extended with CR Update. The dot suffix
enables the update of CR0.

subfzeo Subtract from Zero Extended with Overflow. The o suffix
enables the overflow bits (SO,OV) in the XER.

subfzeo. Subtract from Zero Extended with Overflow and CR Update.
The o. suffix enables the update of CR0 and enables the
overflow bits (SO,OV) in the XER.

Negate 

neg 

neg. 

nego 

nego. 

rD,rA 

The sum ¬(rA) + 1 is placed into rD.

neg Negate

neg. Negate with CR Update. The dot suffix enables the update of
CR0.

nego Negate with Overflow. The o suffix enables the overflow bits
(SO,OV) in the XER.

nego. Negate with Overflow and CR Update. The o. suffix enables
the update of CR0 and enables the overflow bits (SO,OV) in
the XER.

Multiply Low 
Immediate 

mulli 

rD,rA,SIMM 

The low-order 32 bits of the 64-bit product (rA) * SIMM are placed into 
rD. 

This instruction can be used with mulhwx to calculate a full 64-bit
product.

Multiply Low 

mullw
mullw.
mullwo
mullwo.

rD,rA,rB

The low-order 32 bits of the 64-bit product (rA) * (rB) are placed into
register rD.

This instruction can be used with mulhwx to calculate a full 64-bit
product.

mullw Multiply Low

mullw. Multiply Low with CR Update. The dot suffix enables the
update of CR0.

mullwo Multiply Low with Overflow. The o suffix enables the overflow
bits (SO,OV) in the XER.

mullwo. Multiply Low with Overflow and CR Update. The o. suffix
enables the update of CR0 and enables the
overflow bits (SO,OV) in the XER.





























Multiply High
Word 

mulhw 

mulhw. 

rD,rA,rB 

The contents of rA and rB are interpreted as 32-bit signed integers. The 
64-bit product is formed. The high-order 32 bits of the 64-bit product are 
placed into rD. 

mulhw Multiply High Word 

mulhw. Multiply High Word with CR Update. The dot suffix enables
the update of CR0.

Multiply High 
Word Unsigned 

mulhwu 

mulhwu. 

rD,rA,rB 

The contents of rA and of rB are interpreted as 32-bit unsigned integers.
The 64-bit product is formed. The high-order 32 bits of the 64-bit product
are placed into rD.

mulhwu Multiply High Word Unsigned

mulhwu. Multiply High Word Unsigned with CR Update. The dot suffix
enables the update of CR0.

Divide Word 

divw 

divw. 

divwo 

divwo. 

rD,rA,rB 

The dividend is the signed value of rA. The divisor is the signed value of
rB. The low-order 32 bits of the 64-bit quotient are placed into rD. The
remainder is not supplied as a result.

divw Divide Word

divw. Divide Word with CR Update. The dot suffix enables the update
of CR0.

divwo Divide Word with Overflow. The o suffix enables the overflow
bits (SO,OV) in the XER.

divwo. Divide Word with Overflow and CR Update. The o. suffix
enables the update of CR0 and enables the overflow bits
(SO,OV) in the XER.

Divide Word 
Unsigned 

divwu 

divwu. 

divwuo 

divwuo. 

rD,rA,rB 

The dividend is the value in rA. The divisor is the value in rB. The
low-order 32 bits of the 64-bit quotient are placed into rD. The remainder is
not supplied as a result.

divwu Divide Word Unsigned

divwu. Divide Word Unsigned with CR Update. The dot suffix
enables the update of CR0.

divwuo Divide Word Unsigned with Overflow. The o suffix enables the
overflow bits (SO,OV) in the XER.

divwuo. Divide Word Unsigned with Overflow and CR Update. The o.
suffix enables the update of CR0 and enables the overflow
bits (SO,OV) in the XER.
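
The table notes that a multiply-low instruction can be paired with mulhwx to obtain a full 64-bit product. The following is a minimal behavioral sketch of that pairing in Python, not processor code. The function names and the use of plain integers to stand for 32-bit register contents are assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mullw(ra, rb):
    """Low-order 32 bits of the product (the same for signed and unsigned operands)."""
    return ((ra & MASK32) * (rb & MASK32)) & MASK32


def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product."""
    return ((to_signed32(ra) * to_signed32(rb)) >> 32) & MASK32


def mulhwu(ra, rb):
    """High-order 32 bits of the unsigned 64-bit product."""
    return (((ra & MASK32) * (rb & MASK32)) >> 32) & MASK32


def full_signed_product(ra, rb):
    """Combine the mulhw and mullw results into one signed 64-bit value."""
    bits = (mulhw(ra, rb) << 32) | mullw(ra, rb)
    return bits - (1 << 64) if bits & (1 << 63) else bits
```

In this model the low word is identical for signed and unsigned multiplication, so only the choice between mulhw and mulhwu decides how the upper word is interpreted.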


Although there is no “Subtract Immediate” instruction, its effect can be achieved by using
an addi instruction with the immediate operand negated. Simplified mnemonics are
provided that include this negation. The subf instructions subtract the second operand (rA)
from the third operand (rB). Simplified mnemonics are provided in which the third operand
is subtracted from the second operand. See Appendix F, “Simplified Mnemonics,” for
examples.
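
The subtract-from forms in Table 4-1 are all written as sums of the form ¬(rA) + (rB) + carry. As an illustrative sketch only, the Python model below shows why that sum equals (rB) - (rA) modulo 2^32, and how the carrying and extended forms chain through XER[CA] for multiple-word arithmetic. The helper names are hypothetical, and XER[CA] is modeled as a second return value rather than as a register bit.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(a, b, carry_in):
    """Return (32-bit sum, carry out) for a + b + carry_in."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def addc(ra, rb):
    return add_with_carry(ra, rb, 0)


def adde(ra, rb, ca):
    return add_with_carry(ra, rb, ca)


def subfc(ra, rb):
    # NOT(rA) + rB + 1 is rB - rA modulo 2**32
    return add_with_carry(~ra & MASK32, rb, 1)


def subfe(ra, rb, ca):
    return add_with_carry(~ra & MASK32, rb, ca)


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit a + b built from addc (low words) then adde (high words)."""
    lo, ca = addc(a_lo, b_lo)
    hi, _ = adde(a_hi, b_hi, ca)
    return hi, lo


def sub64(a_hi, a_lo, b_hi, b_lo):
    """64-bit a - b built from subfc (low words) then subfe (high words)."""
    lo, ca = subfc(b_lo, a_lo)
    hi, _ = subfe(b_hi, a_hi, ca)
    return hi, lo
```

Note the operand order in sub64: because the subf forms subtract rA from rB, the subtrahend is passed first.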

4.2.1.2 Integer Compare Instructions

The integer compare instructions algebraically or logically compare the contents of register
rA with either the zero-extended value of the UIMM operand, the sign-extended value of
the SIMM operand, or the contents of register rB. The comparison is signed for the cmpi
and cmp instructions, and unsigned for the cmpli and cmpl instructions. Table 4-2
summarizes the integer compare instructions.

The integer compare instructions (shown in Table 4-2) set one of the leftmost three bits of 
the designated CR field, and clear the other two. XER[SO] is copied into bit 3 of the CR 
field. 


Table 4-2. Integer Compare Instructions 


Name 

Mnemonic 

Operand Syntax 

Operation 

Compare Immediate

cmpi

crfD,L,rA,SIMM

The value in register rA is compared with the sign-extended value of
the SIMM operand, treating the operands as signed integers. The
result of the comparison is placed into the CR field specified by
operand crfD.

Compare

cmp

crfD,L,rA,rB

The value in register rA is compared with the value in register rB,
treating the operands as signed integers. The result of the comparison
is placed into the CR field specified by operand crfD.

Compare Logical Immediate

cmpli

crfD,L,rA,UIMM

The value in register rA is compared with 0x0000 || UIMM, treating the
operands as unsigned integers. The result of the comparison is placed
into the CR field specified by operand crfD.

Compare Logical

cmpl

crfD,L,rA,rB

The value in register rA is compared with the value in register rB,
treating the operands as unsigned integers. The result of the
comparison is placed into the CR field specified by operand crfD.


The crfD operand can be omitted if the result of the comparison is to be placed in CR0.
Otherwise the target CR field must be specified in the instruction crfD field, using an
explicit field number.

For information on simplified mnemonics for the integer compare instructions see 
Appendix F, “Simplified Mnemonics.” 
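
The compare instructions differ only in how the operands are interpreted before the comparison. The following Python sketch models the 4-bit CR field they produce, under the assumption that the three leftmost bits are LT, GT, and EQ in that order and that bit 3 receives XER[SO]. The function names (cmp_ carries a trailing underscore only to keep it distinct in Python) and the `so` parameter are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def compare_field(a, b, so=0):
    """Build the 4-bit CR field: LT, GT, EQ, SO (leftmost bit first)."""
    lt, gt, eq = int(a < b), int(a > b), int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)


def cmp_(ra, rb, so=0):
    return compare_field(to_signed32(ra), to_signed32(rb), so)


def cmpl(ra, rb, so=0):
    return compare_field(ra & MASK32, rb & MASK32, so)


def cmpi(ra, simm, so=0):
    simm &= 0xFFFF
    if simm & 0x8000:          # sign-extend the 16-bit immediate
        simm -= 0x10000
    return compare_field(to_signed32(ra), simm, so)


def cmpli(ra, uimm, so=0):
    # zero-extend the 16-bit immediate (0x0000 || UIMM)
    return compare_field(ra & MASK32, uimm & 0xFFFF, so)
```

The same register contents can therefore compare as less-than under cmp and greater-than under cmpl, as with 0xFFFF_FFFF against 1.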

4.2.1.3 Integer Logical Instructions

The logical instructions shown in Table 4-3 perform bit-parallel operations on 32-bit
operands. Logical instructions with CR updating enabled (dot suffix) and the
instructions andi. and andis. set CR field CR0 (bits 0 to 2) to characterize the result of the
logical operation. Logical instructions without CR update and the remaining logical
instructions do not modify the CR. Logical instructions do not affect the XER[SO],
XER[OV], and XER[CA] bits.

See Appendix F, “Simplified Mnemonics,” for simplified mnemonic examples for integer
logical operations.


Table 4-3. Integer Logical Instructions 


Name 

Mnemonic 

Operand 

Syntax 

Operation 

AND 

Immediate 

andi. 

rA,rS,UIMM 

The contents of rS are ANDed with 0x0000 || UIMM and the result is placed
into rA.

CR0 is updated. 

AND 

Immediate 

Shifted 

andis. 

rA,rS,UIMM 

The contents of rS are ANDed with UIMM || 0x0000 and the result is placed
into rA.

CR0 is updated. 

OR 

Immediate 

ori 

rA,rS,UIMM 

The contents of rS are ORed with 0x0000 || UIMM and the result is placed 
into rA. 

The preferred no-op is ori 0,0,0 

OR 

Immediate 

Shifted 

oris 

rA,rS,UIMM 

The contents of rS are ORed with UIMM || 0x0000 and the result is placed 
into rA. 

XOR 

Immediate 

xori 

rA,rS,UIMM 

The contents of rS are XORed with 0x0000 || UIMM and the result is placed 
into rA. 

XOR 

Immediate 

Shifted 

xoris 

rA,rS,UIMM 

The contents of rS are XORed with UIMM || 0x0000 and the result is placed
into rA.

AND 

and 

and. 

rA,rS,rB 

The contents of rS are ANDed with the contents of register rB and the result 
is placed into rA. 

and AND 

and. AND with CR Update. The dot suffix enables the update of CR0. 

OR 

or 

or. 

rA,rS,rB 

The contents of rS are ORed with the contents of rB and the result is placed 
into rA. 

or OR 

or. OR with CR Update. The dot suffix enables the update of CR0. 

XOR 

xor 

xor. 

rA,rS,rB 

The contents of rS are XORed with the contents of rB and the result is 
placed into rA. 

xor XOR 

xor. XOR with CR Update. The dot suffix enables the update of CR0. 

NAND 

nand 

nand. 

rA,rS,rB 

The contents of rS are ANDed with the contents of rB and the one’s 
complement of the result is placed into rA. 

nand NAND 

nand. NAND with CR Update. The dot suffix enables the update of CR0.

Note: nandx, with rS = rB, can be used to obtain the one's complement.

NOR 

nor 

nor. 

rA,rS,rB 

The contents of rS are ORed with the contents of rB and the one’s 
complement of the result is placed into rA. 

nor NOR 

nor. NOR with CR Update. The dot suffix enables the update of CR0. 

Note: norx, with rS = rB, can be used to obtain the one's complement.






















































Equivalent 

eqv 

eqv. 

rA,rS,rB 

The contents of rS are XORed with the contents of rB and the 
complemented result is placed into rA. 

eqv Equivalent 

eqv. Equivalent with CR Update. The dot suffix enables the update of
CR0.

AND with
Complement

andc

andc.

rA,rS,rB

The contents of rS are ANDed with the one’s complement of the contents of
rB and the result is placed into rA.

andc AND with Complement

andc. AND with Complement with CR Update. The dot suffix enables the
update of CR0.

OR with
Complement

orc

orc.

rA,rS,rB

The contents of rS are ORed with the complement of the contents of rB and
the result is placed into rA.

orc OR with Complement

orc. OR with Complement with CR Update. The dot suffix enables the
update of CR0.

Extend Sign 
Byte 

extsb 

extsb. 

rA,rS 

The contents of the low-order eight bits of rS are placed into the low-order 
eight bits of rA. Bit 24 is placed into the remaining high-order bits of rA. 

extsb Extend Sign Byte 

extsb. Extend Sign Byte with CR Update. The dot suffix enables the
update of CR0.

Extend Sign
Half Word

extsh

extsh.

rA,rS

The contents of the low-order 16 bits of rS are placed into the low-order
16 bits of rA. Bit 16 is placed into the remaining high-order bits of rA.

extsh Extend Sign Half Word

extsh. Extend Sign Half Word with CR Update. The dot suffix enables the
update of CR0.

Count Leading
Zeros Word

cntlzw

cntlzw.

rA,rS

A count of the number of consecutive zero bits starting at bit 0 of rS is
placed into rA. This number ranges from 0 to 32, inclusive.

If Rc = 1 (dot suffix), LT is cleared in CR0.

cntlzw Count Leading Zeros Word

cntlzw. Count Leading Zeros Word with CR Update. The dot suffix enables
the update of CR0.
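
Several of the operations in Table 4-3 are easiest to check against a small model. The Python sketch below is one possible rendering of nor, andc, eqv, the sign-extension instructions, and cntlzw on 32-bit values. It is illustrative only: the function names are hypothetical, CR0 updates are not modeled, and bit 0 is taken to be the most significant bit, as in the text.

```python
MASK32 = 0xFFFFFFFF


def nor(rs, rb):
    """One's complement of (rS OR rB); with rs == rb this is a plain NOT."""
    return ~(rs | rb) & MASK32


def andc(rs, rb):
    """rS AND the one's complement of rB."""
    return rs & ~rb & MASK32


def eqv(rs, rb):
    """Complement of (rS XOR rB)."""
    return ~(rs ^ rb) & MASK32


def extsb(rs):
    """Copy the low byte and replicate its sign bit (bit 24) upward."""
    low = rs & 0xFF
    return low | 0xFFFFFF00 if low & 0x80 else low


def extsh(rs):
    """Copy the low halfword and replicate its sign bit (bit 16) upward."""
    low = rs & 0xFFFF
    return low | 0xFFFF0000 if low & 0x8000 else low


def cntlzw(rs):
    """Number of consecutive zero bits starting at bit 0 (0 to 32)."""
    return 32 - (rs & MASK32).bit_length()
```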


4.2.1.4 Integer Rotate and Shift Instructions

Rotation operations are performed on data from a GPR, and the result, or a portion of the 
result, is returned to a GPR. The rotation operations rotate a 32-bit quantity left by a 
specified number of bit positions. Bits that exit from position 0 enter at position 31. 

The rotate and shift instructions employ a mask generator. The mask is 32 bits long and
consists of ‘1’ bits from a start bit, Mstart, through and including a stop bit, Mstop, and ‘0’
bits elsewhere.

The values of Mstart and Mstop range from 0 to 31. If Mstart > Mstop, the ‘1’ bits wrap
around from position 31 to position 0.

Thus the mask is formed as follows:


if Mstart ≤ Mstop then
    mask[Mstart-Mstop] = ones
    mask[all other bits] = zeros
else
    mask[Mstart-31] = ones
    mask[0-Mstop] = ones
    mask[all other bits] = zeros

It is not possible to specify an all-zero mask. The use of the mask is described in the 
following sections. 
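
The mask rule can be expressed compactly in code. The following Python sketch assumes the bit numbering used in the text (bit 0 is the most significant bit of the 32-bit word), and the name mask32 is chosen for illustration only.

```python
def mask32(mstart, mstop):
    """Return the 32-bit mask with ones from mstart through mstop.

    Bit 0 is the most significant bit. When mstart > mstop the run of
    ones wraps around from bit 31 to bit 0.
    """
    if not (0 <= mstart <= 31 and 0 <= mstop <= 31):
        raise ValueError("Mstart and Mstop must be in the range 0 to 31")

    def ones(first, last):
        # Ones in bit positions first..last inclusive, with first <= last.
        width = last - first + 1
        return ((1 << width) - 1) << (31 - last)

    if mstart <= mstop:
        return ones(mstart, mstop)
    return ones(mstart, 31) | ones(0, mstop)
```

Because every combination of Mstart and Mstop selects at least one bit, the model also shows why an all-zero mask cannot be specified.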

If CR updating is enabled, rotate and shift instructions set CR0[0-2] according to the 
contents of rA at the completion of the instruction. Rotate and shift instructions do not 
change the values of XER[OV] and XER[SO] bits. Rotate and shift instructions, except 
algebraic right shifts, do not change the XER[CA] bit. 

See Appendix F, “Simplified Mnemonics,” for a complete list of simplified mnemonics that 
allows simpler coding of often-used functions such as clearing the leftmost or rightmost 
bits of a register, left justifying or right justifying an arbitrary field, and simple rotates and 
shifts. 

4.2.1.4.1 Integer Rotate Instructions

Integer rotate instructions rotate the contents of a register. The result of the rotation is either 
inserted into the target register under control of a mask (if a mask bit is 1 the associated bit 
of the rotated data is placed into the target register, and if the mask bit is 0 the associated 
bit in the target register is either zeroed or unchanged), or ANDed with a mask before being 
placed into the target register. 

Rotate left instructions allow apparent right-rotation of the contents of a register to be 
performed by a left-rotation of 32 - n, where n is the number of bits by which to rotate right. 
The integer rotate instructions are summarized in Table 4-4. 


Table 4-4. Integer Rotate Instructions 


Name 

Mnemonic 

Operand Syntax 

Operation 

Rotate Left 
Word 
Immediate 
then AND with 
Mask 

rlwinm 

rlwinm. 

rA,rS,SH,MB,ME 

The contents of register rS are rotated left by the number of bits 
specified by operand SH. A mask is generated having 1 bits from 
the bit specified by operand MB through the bit specified by 
operand ME and 0 bits elsewhere. The rotated data is ANDed with 
the generated mask and the result is placed into register rA. 

rlwinm Rotate Left Word Immediate then AND with Mask 

rlwinm. Rotate Left Word Immediate then AND with Mask with
CR Update. The dot suffix enables the update of CR0.

Rotate Left 
Word then 

AND with 

Mask 

rlwnm 

rlwnm. 

rA,rS,rB,MB,ME 

The contents of rS are rotated left by the number of bits specified
in the low-order five bits of rB. A mask is generated
having 1 bits from the bit specified by operand MB through the bit
specified by operand ME and 0 bits elsewhere. The rotated word
is ANDed with the generated mask and the result is placed into rA.

rlwnm Rotate Left Word then AND with Mask

rlwnm. Rotate Left Word then AND with Mask with CR Update.
The dot suffix enables the update of CR0.

Rotate Left 
Word 
Immediate 
then Mask 
Insert 

rlwimi 

rlwimi. 

rA,rS,SH,MB,ME 

The contents of rS are rotated left by the number of bits specified 
by operand SH. A mask is generated having 1 bits from the bit 
specified by operand MB through the bit specified by operand ME 
and 0 bits elsewhere. The rotated word is inserted into rA under 
control of the generated mask. 

rlwimi Rotate Left Word Immediate then Mask Insert

rlwimi. Rotate Left Word Immediate then Mask Insert with CR
Update. The dot suffix enables the update of CR0.
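
To make the rotate-then-mask behavior concrete, here is a hedged Python model of the three instructions in Table 4-4. The function names mirror the mnemonics but are otherwise hypothetical, the mask helper repeats the rule given earlier in this section, and CR0 updates are not modeled.

```python
MASK32 = 0xFFFFFFFF


def mask32(mb, me):
    """Ones from bit mb through bit me (bit 0 = most significant), wrapping if mb > me."""
    def ones(first, last):
        return ((1 << (last - first + 1)) - 1) << (31 - last)
    return ones(mb, me) if mb <= me else ones(mb, 31) | ones(0, me)


def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32) bit positions."""
    n &= 31
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def rlwinm(rs, sh, mb, me):
    """Rotate left by SH, then AND with the generated mask."""
    return rotl32(rs, sh) & mask32(mb, me)


def rlwnm(rs, rb, mb, me):
    """As rlwinm, with the rotate count taken from the low five bits of rB."""
    return rotl32(rs, rb & 0x1F) & mask32(mb, me)


def rlwimi(ra, rs, sh, mb, me):
    """Insert the rotated word into rA under the mask; other bits of rA are kept."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)
```

For example, a right rotation by 8 is obtained as rlwinm(x, 32 - 8, 0, 31), and a logical right shift by 4 as rlwinm(x, 32 - 4, 4, 31), which is the kind of combination the simplified mnemonics wrap.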


4.2.1.4.2 Integer Shift Instructions

The integer shift instructions perform left and right shifts. Immediate-form logical 
(unsigned) shift operations are obtained by specifying masks and shift values for certain 
rotate instructions. Simplified mnemonics (shown in Appendix F, “Simplified 
Mnemonics”) are provided to make coding of such shifts simpler and easier to understand. 

Any shift right algebraic instruction, followed by addze, can be used to divide quickly by
2^n. The setting of XER[CA] by the shift right algebraic instruction is independent of mode.
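
The division idiom works because an algebraic right shift rounds toward negative infinity, and the carry bit supplies the correction needed to round toward zero. The sketch below models this in Python under the assumption that XER[CA] is set when the source is negative and any 1 bits are shifted out, which is the behavior defined in the instruction descriptions. The function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def srawi(rs, sh):
    """Return (result, CA) for an algebraic right shift by sh (0 to 31)."""
    value = to_signed32(rs)
    result = (value >> sh) & MASK32
    shifted_out = (rs & MASK32) & ((1 << sh) - 1)
    ca = int(value < 0 and shifted_out != 0)
    return result, ca


def addze(ra, ca):
    """(rA) + XER[CA], truncated to 32 bits."""
    return (ra + ca) & MASK32


def div_pow2(rs, n):
    """Signed division by 2**n, rounding toward zero: srawi followed by addze."""
    quotient, ca = srawi(rs, n)
    return to_signed32(addze(quotient, ca))
```

With -7 shifted right by one position the shift alone yields -4, and adding the carry gives -3, the quotient a signed divide would produce.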

Multiple-precision shifts can be programmed as shown in Appendix C, “Multiple-Precision 
Shifts.” 

The integer shift instructions are summarized in Table 4-5. 


Table 4-5. Integer Shift Instructions 


Name 

Mnemonic 

Operand 

Syntax 

Operation 

Shift Left 
Word 

slw 

slw. 

rA,rS,rB 

The contents of rS are shifted left the number of bits specified by the low- 
order six bits of rB. Bits shifted out of position 0 are lost. Zeros are supplied 
to the vacated positions on the right. The 32-bit result is placed into rA. 

slw Shift Left Word 

slw. Shift Left Word with CR Update. The dot suffix enables the
update of CR0.

Shift Right 
Word 

srw 

srw. 

rA,rS,rB 

The contents of rS are shifted right the number of bits specified by the low- 
order six bits of rB. Bits shifted out of position 31 are lost. Zeros are supplied 
to the vacated positions on the left. The 32-bit result is placed into rA. 

srw Shift Right Word 

srw. Shift Right Word with CR Update. The dot suffix enables the
update of CR0.


Shift Right 
Algebraic 
Word 
Immediate 

srawi 

srawi. 

rA,rS,SH 

The contents of rS are shifted right the number of bits specified by operand 
SH. Bits shifted out of position 31 are lost. Bit 0 of rS is replicated to fill the 
vacated positions on the left. The 32-bit result is placed into rA. 

srawi Shift Right Algebraic Word Immediate 

srawi. Shift Right Algebraic Word Immediate with CR Update. The dot
suffix enables the update of CR0.

Shift Right 

Algebraic 

Word 

sraw 

sraw. 

rA,rS,rB 

The contents of rS are shifted right the number of bits specified by the low- 
order six bits of rB. Bits shifted out of position 31 are lost. Bit 0 of rS is 
replicated to fill the vacated positions on the left. The 32-bit result is placed 
into rA. 

sraw Shift Right Algebraic Word 

sraw. Shift Right Algebraic Word with CR Update. The dot suffix
enables the update of CR0.
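
The register-form shifts take their count from the low-order six bits of rB, so counts of 32 through 63 are meaningful and shift every bit out. The following Python sketch illustrates that behavior. The function names are hypothetical, and neither CR0 nor the XER[CA] result of sraw is modeled here.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slw(rs, rb):
    """Shift left; counts of 32 to 63 produce zero."""
    n = rb & 0x3F
    return (rs << n) & MASK32 if n < 32 else 0


def srw(rs, rb):
    """Logical shift right; counts of 32 to 63 produce zero."""
    n = rb & 0x3F
    return (rs & MASK32) >> n if n < 32 else 0


def sraw(rs, rb):
    """Algebraic shift right; counts of 32 to 63 produce 32 copies of the sign bit."""
    n = rb & 0x3F
    return (to_signed32(rs) >> n) & MASK32
```

Only six bits of rB are examined, so a count of 64 behaves as a count of 0 in this model.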


4.2.2 Floating-Point Instructions 

This section describes the floating-point instructions, which include the following: 

• Floating-point arithmetic instructions 

• Floating-point multiply-add instructions 

• Floating-point rounding and conversion instructions 

• Floating-point compare instructions 

• Floating-point status and control register instructions 

• Floating-point move instructions 

NOTE: MSR[FP] must be set in order for any of these instructions (including the 
floating-point loads and stores) to be executed. 

If MSR[FP] = 0 when any floating-point instruction is attempted, the
floating-point unavailable exception is taken (see Section 6.4.8, “Floating-Point
Unavailable Exception (0x00800)”).

See Section 4.2.3, “Load and Store Instructions,” for information about floating- 
point loads and stores. 

The PowerPC architecture supports a floating-point system as defined in the IEEE-754 
standard, but requires software support to conform with that standard. Floating-point 
operations conform to the IEEE-754 standard, with the exception of operations performed 
with the fmadd, fres, fsel, and frsqrte instructions, or if software sets the non-IEEE mode 
bit (NI) in the FPSCR. Refer to Section 3.3, “Floating-Point Execution Models — UISA,” 
for detailed information about the floating-point formats and exception conditions. Also, 
refer to Appendix D, “Floating-Point Models,” for more information on the floating-point 
execution models used by the PowerPC architecture. 




















4.2.2.1 Floating-Point Arithmetic Instructions

The floating-point arithmetic instructions are summarized in Table 4-6. 


Table 4-6. Floating-Point Arithmetic Instructions 


Name 

Mnemonic 

Operand 

Syntax 

Operation 

Floating 

Add 

(Double- 

Precision) 

fadd 

fadd. 

frD,frA,frB 

The floating-point operand in register frA is added to the floating-point
operand in register frB. If the most significant bit of the resultant significand
is not a one, the result is normalized. The result is rounded to the target
precision under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.

fadd Floating Add (Double-Precision)

fadd. Floating Add (Double-Precision) with CR Update. The dot suffix
enables the update of CR1.

Floating 

Add Single 

fadds 

fadds. 

frD,frA,frB 

The floating-point operand in register frA is added to the floating-point 
operand in register frB. If the most significant bit of the resultant significand 
is not a one, the result is normalized. The result is rounded to the target 
precision under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

fadds Floating Add Single 

fadds. Floating Add Single with CR Update. The dot suffix enables the
update of CR1.

Floating 

Subtract 

(Double- 

Precision) 

fsub 

fsub. 

frD,frA,frB 

The floating-point operand in register frB is subtracted from the
floating-point operand in register frA. If the most significant bit of the resultant
significand is not a one, the result is normalized. The result is rounded to the
target precision under control of the floating-point rounding control field RN
of the FPSCR and placed into register frD.

fsub Floating Subtract (Double-Precision)

fsub. Floating Subtract (Double-Precision) with CR Update. The dot
suffix enables the update of CR1.

Floating 

Subtract 

Single 

fsubs 

fsubs. 

frD,frA,frB 

The floating-point operand in register frB is subtracted from the
floating-point operand in register frA. If the most significant bit of the resultant
significand is not a one, the result is normalized. The result is rounded to the
target precision under control of the floating-point rounding control field RN
of the FPSCR and placed into register frD.

fsubs Floating Subtract Single

fsubs. Floating Subtract Single with CR Update. The dot suffix enables
the update of CR1.

Floating 

Multiply 

(Double- 

Precision) 

fmul 

fmul. 

frD,frA,frC 

The floating-point operand in register frA is multiplied by the floating-point 
operand in register frC. 

fmul Floating Multiply (Double-Precision) 

fmul. Floating Multiply (Double-Precision) with CR Update. The dot
suffix enables the update of CR1.

Floating 

Multiply 

Single 

fmuls 

fmuls. 

frD,frA,frC 

The floating-point operand in register frA is multiplied by the floating-point 
operand in register frC. 

fmuls Floating Multiply Single 

fmuls. Floating Multiply Single with CR Update. The dot suffix enables
the update of CR1.


































Floating 

Divide 

(Double- 

Precision) 

fdiv 

fdiv. 

frD,frA,frB 

The floating-point operand in register frA is divided by the floating-point 
operand in register frB. No remainder is preserved. 

fdiv Floating Divide (Double-Precision) 

fdiv. Floating Divide (Double-Precision) with CR Update. The dot
suffix enables the update of CR1.

Floating 

Divide 

Single 

fdivs 

fdivs. 

frD,frA,frB 

The floating-point operand in register frA is divided by the floating-point 
operand in register frB. No remainder is preserved. 

fdivs Floating Divide Single 

fdivs. Floating Divide Single with CR Update. The dot suffix enables
the update of CR1.

Floating 

Square 

Root 

(Double- 

Precision) 

fsqrt 

fsqrt. 

frD,frB 

The square root of the floating-point operand in register frB is placed into 
register frD. 

fsqrt Floating Square Root (Double-Precision) 

fsqrt. Floating Square Root (Double-Precision) with CR Update. The
dot suffix enables the update of CR1.

This instruction is optional. 

Floating 

Square 

Root 

Single 

fsqrts 

fsqrts. 

frD,frB

The square root of the floating-point operand in register frB is placed into
register frD.

fsqrts Floating Square Root Single

fsqrts. Floating Square Root Single with CR Update. The dot suffix
enables the update of CR1.

This instruction is optional. 

Floating 

Reciprocal 

Estimate 

Single 

fres 

fres. 

frD,frB 

A single-precision estimate of the reciprocal of the floating-point operand in 
register frB is placed into frD. The estimate placed into frD is correct to a 
precision of one part in 256 of the reciprocal of frB. 

fres Floating Reciprocal Estimate Single 

fres. Floating Reciprocal Estimate Single with CR Update. The dot
suffix enables the update of CR1.

This instruction is optional. 

Floating 

Reciprocal 

Square 

Root 

Estimate 

frsqrte 

frsqrte. 

frD,frB 

A double-precision estimate of the reciprocal of the square root of the 
floating-point operand in register frB is placed into frD. The estimate 
placed into frD is correct to a precision of one part in 32 of the reciprocal of 
the square root of frB. 

frsqrte Floating Reciprocal Square Root Estimate

frsqrte. Floating Reciprocal Square Root Estimate with CR Update. The
dot suffix enables the update of CR1.

This instruction is optional. 

Floating 

Select 

fsel 

frD,frA,frC,frB 

The floating-point operand in frA is compared to the value zero. If the 
operand is greater than or equal to zero, frD is set to the contents of frC. If 
the operand is less than zero or is a NaN, frD is set to the contents of frB. 
The comparison ignores the sign of zero (that is, regards +0 as equal to 
-0). 

fsel Floating Select 

fsel. Floating Select with CR Update. The dot suffix enables the
update of CR1.

This instruction is optional. 
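
The selection rule for fsel is easy to misread, particularly for NaN and negative zero. The short Python sketch below is a behavioral model only, with Python floats standing in for floating-point register contents. The argument order follows the operand syntax frD,frA,frC,frB, and select_max is a hypothetical usage example rather than anything defined by the architecture.

```python
def fsel(fra, frc, frb):
    """Return frc if fra >= 0 (including -0), otherwise frb (including NaN)."""
    # A NaN compares false against everything, so it falls through to frb.
    return frc if fra >= 0.0 else frb


def select_max(a, b):
    """Branch-free maximum of two ordinary finite values using fsel."""
    return fsel(a - b, a, b)
```

The select_max helper is only meaningful for finite, non-NaN inputs, because the subtraction itself can produce a NaN for infinite operands.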





































4.2.2.2 Floating-Point Multiply-Add Instructions

These instructions combine multiply and add operations without an intermediate rounding 
operation. The fractional part of the intermediate product is 106 bits wide, and all 106 bits 
take part in the add/subtract portion of the instruction. 

Status bits are set as follows: 

• Overflow, underflow, and inexact exception bits, the FR and FI bits, and the FPRF 
field are set based on the final result of the operation, and not on the result of the 
multiplication. 

• Invalid operation exception bits are set as if the multiplication and the addition were 
performed using two separate instructions (fmuls, followed by fadds or fsubs). That 
is, multiplication of infinity by zero or of anything by an SNaN, and/or addition of 
an SNaN, cause the corresponding exception bits to be set. 

The floating-point multiply-add instructions are summarized in Table 4-7. 


Table 4-7. Floating-Point Multiply-Add Instructions 


Name 

Mnemonic 

Operand Syntax 

Operation 

Floating 

Multiply- 

Add 

(Double- 

Precision) 

fmadd 

fmadd. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register frB 
is added to this intermediate result. 

fmadd Floating Multiply-Add (Double-Precision) 

fmadd. Floating Multiply-Add (Double-Precision) with CR Update.
The dot suffix enables the update of CR1.

Floating 

Multiply- 

Add 

Single 

fmadds 

fmadds. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register frB 
is added to this intermediate result. 

fmadds Floating Multiply-Add Single 

fmadds. Floating Multiply-Add Single with CR Update. The dot suffix
enables the update of CR1.

Floating 

Multiply- 

Subtract 

(Double- 

Precision) 

fmsub 

fmsub. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register frB 
is subtracted from this intermediate result. 

fmsub Floating Multiply-Subtract (Double-Precision) 

fmsub. Floating Multiply-Subtract (Double-Precision) with CR
Update. The dot suffix enables the update of CR1.

Floating 

Multiply- 

Subtract 

Single 

fmsubs 

fmsubs. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register frB 
is subtracted from this intermediate result. 

fmsubs Floating Multiply-Subtract Single 

fmsubs. Floating Multiply-Subtract Single with CR Update. The dot
suffix enables the update of CR1.


Floating 

Negative 

Multiply- 

Add 

(Double- 

Precision) 

fnmadd 

fnmadd. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC. The floating-point operand in register frB
is added to this intermediate result, and the sum is then negated.

fnmadd Floating Negative Multiply-Add (Double-Precision)

fnmadd. Floating Negative Multiply-Add (Double-Precision) with CR
Update. The dot suffix enables the update of CR1.

Floating 

Negative 

Multiply- 

Add 

Single 

fnmadds 

fnmadds. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC. The floating-point operand in register frB
is added to this intermediate result, and the sum is then negated.

fnmadds Floating Negative Multiply-Add Single

fnmadds. Floating Negative Multiply-Add Single with CR Update. The
dot suffix enables the update of CR1.

Floating 

Negative 

Multiply- 

Subtract 

(Double- 

Precision) 

fnmsub 

fnmsub. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC. The floating-point operand in register frB
is subtracted from this intermediate result, and the difference is then negated.

fnmsub Floating Negative Multiply-Subtract (Double-Precision)

fnmsub. Floating Negative Multiply-Subtract (Double-Precision) with
CR Update. The dot suffix enables the update of CR1.

Floating 

Negative 

Multiply- 

Subtract 

Single 

fnmsubs 

fnmsubs. 

frD,frA,frC,frB 

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC. The floating-point operand in register frB
is subtracted from this intermediate result, and the difference is then negated.

fnmsubs Floating Negative Multiply-Subtract Single

fnmsubs. Floating Negative Multiply-Subtract Single with CR Update.
The dot suffix enables the update of CR1.


For more information on multiply-add instructions, refer to Section D.2, “Execution Model 
for Multiply-Add Type Instructions.” 
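
The effect of keeping the intermediate product unrounded can be illustrated with a small
model. The following Python sketch is not taken from the architecture definition: the
function names are hypothetical, only finite operands are handled, and round-to-nearest
is assumed. It forms the exact product and sum with rational arithmetic and rounds once
at the end.

```python
from fractions import Fraction

def fmadd(a, c, b):
    """Model of (a * c) + b with a single final rounding (finite inputs only)."""
    exact = Fraction(a) * Fraction(c) + Fraction(b)
    return float(exact)  # the only rounding step, to nearest

def fmsub(a, c, b):
    """Model of (a * c) - b with a single final rounding (finite inputs only)."""
    exact = Fraction(a) * Fraction(c) - Fraction(b)
    return float(exact)

def fnmadd(a, c, b):
    """Negated multiply-add: -((a * c) + b)."""
    return -fmadd(a, c, b)

def fnmsub(a, c, b):
    """Negated multiply-subtract: -((a * c) - b)."""
    return -fmsub(a, c, b)

x = 1.0 + 2.0 ** -27
b = -(1.0 + 2.0 ** -26)
separate = x * x + b    # product is rounded before the add
fused = fmadd(x, x, b)  # exact product is kept until the final rounding
```

In this example the separately rounded sequence loses the low-order term of the product,
while the single-rounding model retains it.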

4.2.2.3 Floating-Point Rounding and Conversion Instructions 

The Floating Round to Single-Precision (frsp) instruction is used to round a 64-bit 
double-precision number to a 32-bit single-precision floating-point number. The 
floating-point convert instructions convert a 64-bit double-precision floating-point 
number to a 32-bit signed integer number. 

The PowerPC architecture defines bits 0-31 of floating-point register frD as undefined 
when executing the Floating Convert to Integer Word (fctiw) and Floating Convert to 
Integer Word with Round toward Zero (fctiwz) instructions. The floating-point rounding 
and conversion instructions are shown in Table 4-8. 

Examples of uses of these instructions to perform various conversions can be found in 
Appendix D, “Floating-Point Models.” 


Table 4-8. Floating-Point Rounding and Conversion Instructions 

Floating Round to Single-Precision
    Mnemonic: frsp, frsp.
    Operand syntax: frD,frB
    Operation: The floating-point operand in frB is rounded to single-precision using
    the rounding mode specified by FPSCR[RN] and placed into frD.
    frsp      Floating Round to Single-Precision
    frsp.     Floating Round to Single-Precision with CR Update. The dot suffix
              enables the update of CR1.

Floating Convert to Integer Word
    Mnemonic: fctiw, fctiw.
    Operand syntax: frD,frB
    Operation: The floating-point operand in register frB is converted to a 32-bit
    signed integer, using the rounding mode specified by FPSCR[RN], and placed in the
    low-order 32 bits of frD. Bits 0-31 of frD are undefined.
    fctiw     Floating Convert to Integer Word
    fctiw.    Floating Convert to Integer Word with CR Update. The dot suffix enables
              the update of CR1.

Floating Convert to Integer Word with Round toward Zero
    Mnemonic: fctiwz, fctiwz.
    Operand syntax: frD,frB
    Operation: The floating-point operand in register frB is converted to a 32-bit
    signed integer, using the rounding mode Round toward Zero, and placed in the
    low-order 32 bits of frD. Bits 0-31 of frD are undefined.
    fctiwz    Floating Convert to Integer Word with Round toward Zero
    fctiwz.   Floating Convert to Integer Word with Round toward Zero with CR Update.
              The dot suffix enables the update of CR1.
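
The rounding and conversion operations in Table 4-8 can be approximated in a few lines.
The sketch below is illustrative only and uses hypothetical helper names. It assumes
round-to-nearest for frsp, does not model single-precision overflow, and assumes that
out-of-range and NaN operands of fctiwz saturate as described in the instruction
definitions rather than in this section.

```python
import math
import struct

def frsp(x):
    """Round a double to single precision (round-to-nearest assumed)."""
    # Packing as a 32-bit float rounds; unpacking widens back to double exactly.
    return struct.unpack(">f", struct.pack(">f", x))[0]

def fctiwz(x):
    """Convert to a 32-bit signed integer, rounding toward zero.

    Saturation for out-of-range and NaN operands is an assumption of this sketch.
    """
    if math.isnan(x):
        return -2**31
    if x >= 2**31:
        return 2**31 - 1
    if x <= -2**31:
        return -2**31
    return math.trunc(x)
```

For example, fctiwz applied to 2.7 and -2.7 gives 2 and -2, whereas a conversion that
rounds to nearest would give 3 and -3.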


4.2.2.4 Floating-Point Compare Instructions 

Floating-point compare instructions compare the contents of two floating-point registers. 
The comparison ignores the sign of zero (that is, +0 = -0) and can be ordered or 
unordered. The comparison sets one bit in the designated CR field and clears the 
other three bits. The FPCC (floating-point condition code) in bits 16-19 of the FPSCR 
(floating-point status and control register) is set in the same way. 

The CR field and the FPCC are interpreted as shown in Table 4-9. 


Table 4-9. CR Bit Settings 

Bit   Name   Description
0     FL     (frA) < (frB)
1     FG     (frA) > (frB)
2     FE     (frA) = (frB)
3     FU     (frA) ? (frB) (unordered)
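
The bit settings in Table 4-9 can be expressed as a short function. This is a sketch with
a hypothetical name; it models only the four result bits and does not model the
exception behavior that distinguishes the ordered and unordered compares.

```python
import math

def fcmp_bits(a, b):
    """Return (FL, FG, FE, FU) for a floating-point compare of a with b."""
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 0, 1)  # unordered
    if a < b:
        return (1, 0, 0, 0)
    if a > b:
        return (0, 1, 0, 0)
    return (0, 0, 1, 0)      # equal, including +0 compared with -0
```

Exactly one of the four bits is set for any pair of operands.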

The floating-point compare instructions are summarized in Table 4-10. 


Table 4-10. Floating-Point Compare Instructions 

Floating Compare Unordered
    Mnemonic: fcmpu
    Operand syntax: crfD,frA,frB
    Operation: The floating-point operand in frA is compared to the floating-point
    operand in frB. The result of the compare is placed into crfD and the FPCC.

Floating Compare Ordered
    Mnemonic: fcmpo
    Operand syntax: crfD,frA,frB
    Operation: The floating-point operand in frA is compared to the floating-point
    operand in frB. The result of the compare is placed into crfD and the FPCC.


4.2.2.5 Floating-Point Status and Control Register Instructions 

Every FPSCR instruction appears to synchronize the effects of all floating-point 
instructions executed by a given processor. Executing an FPSCR instruction ensures that all 
floating-point instructions previously initiated by the given processor appear to have 
completed before the FPSCR instruction is initiated and that no subsequent floating-point 
instructions appear to be initiated by the given processor until the FPSCR instruction has 
completed. In particular: 

• All exceptions caused by the previously initiated instructions are recorded in the 
FPSCR before the FPSCR instruction is initiated. 

• All invocations of the floating-point exception handler caused by the previously 
initiated instructions have occurred before the FPSCR instruction is initiated. 

• No subsequent floating-point instruction that depends on or alters the settings of any 
FPSCR bits appears to be initiated until the FPSCR instruction has completed. 

Floating-point memory access instructions are not affected by the execution of the FPSCR 
instructions. 

The FPSCR instructions are summarized in Table 4-11. 


Table 4-11. Floating-Point Status and Control Register Instructions 

Move from FPSCR
    Mnemonic: mffs, mffs.
    Operand syntax: frD
    Operation: The contents of the FPSCR are placed into bits 32-63 of frD. Bits 0-31
    of frD are undefined.
    mffs      Move from FPSCR
    mffs.     Move from FPSCR with CR Update. The dot suffix enables the update of CR1.

Move to Condition Register from FPSCR
    Mnemonic: mcrfs
    Operand syntax: crfD,crfS
    Operation: The contents of the FPSCR field specified by operand crfS are copied to
    the CR field specified by operand crfD. All exception bits copied (except FEX and
    VX bits) are cleared in the FPSCR.

Move to FPSCR Field Immediate
    Mnemonic: mtfsfi, mtfsfi.
    Operand syntax: crfD,IMM
    Operation: The contents of the IMM field are placed into FPSCR field crfD. The
    contents of FPSCR[FX] are altered only if crfD = 0.
    mtfsfi    Move to FPSCR Field Immediate
    mtfsfi.   Move to FPSCR Field Immediate with CR Update. The dot suffix enables the
              update of CR1.

Move to FPSCR Fields
    Mnemonic: mtfsf, mtfsf.
    Operand syntax: FM,frB
    Operation: Bits 32-63 of frB are placed into the FPSCR under control of the field
    mask specified by FM. The field mask identifies the 4-bit fields affected. Let i be
    an integer in the range 0-7. If FM[i] = 1, FPSCR field i (FPSCR bits 4*i through
    4*i+3) is set to the contents of the corresponding field of the low-order 32 bits
    of frB. The contents of FPSCR[FX] are altered only if FM[0] = 1.
    mtfsf     Move to FPSCR Fields
    mtfsf.    Move to FPSCR Fields with CR Update. The dot suffix enables the update
              of CR1.

Move to FPSCR Bit 0
    Mnemonic: mtfsb0, mtfsb0.
    Operand syntax: crbD
    Operation: The FPSCR bit location specified by operand crbD is cleared. Bits 1 and
    2 (FEX and VX) cannot be reset explicitly.
    mtfsb0    Move to FPSCR Bit 0
    mtfsb0.   Move to FPSCR Bit 0 with CR Update. The dot suffix enables the update of
              CR1.

Move to FPSCR Bit 1
    Mnemonic: mtfsb1, mtfsb1.
    Operand syntax: crbD
    Operation: The FPSCR bit location specified by operand crbD is set. Bits 1 and 2
    (FEX and VX) cannot be set explicitly.
    mtfsb1    Move to FPSCR Bit 1
    mtfsb1.   Move to FPSCR Bit 1 with CR Update. The dot suffix enables the update of
              CR1.
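
The field-mask selection performed by mtfsf can be sketched as follows. The function and
parameter names are hypothetical, and the model is deliberately incomplete: it copies
the selected 4-bit fields only and does not model the special treatment of the FEX and
VX bits when field 0 is written.

```python
def mtfsf(fpscr, frb_low, fm):
    """Copy the 4-bit fields of frb_low selected by the 8-bit mask fm into fpscr.

    Numbering is big-endian: field 0 is the most significant nibble of the
    32-bit value, and FM[0] is the most significant bit of the mask.
    """
    result = fpscr & 0xFFFFFFFF
    for i in range(8):
        if (fm >> (7 - i)) & 1:            # FM[i] = 1
            shift = 28 - 4 * i             # position of FPSCR bits 4*i..4*i+3
            nibble_mask = 0xF << shift
            result = (result & ~nibble_mask) | (frb_low & nibble_mask)
    return result & 0xFFFFFFFF
```

A mask of 0b00000001 therefore selects field 7, which holds FPSCR bits 28-31.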



4.2.2.6 Floating-Point Move Instructions 

Floating-point move instructions copy data from one FPR to another, altering the sign bit 
(bit 0) as described for the fneg, fabs, and fnabs instructions in Table 4-12. The fneg, fabs, 
and fnabs instructions may alter the sign bit of a NaN. The floating-point move instructions 
do not modify the FPSCR. The CR update option in these instructions controls the placing 
of result status into CR1. If the CR update option is enabled, CR1 is set; otherwise, CR1 is 
unchanged. 

Table 4-12 provides a summary of the floating-point move instructions. 


Table 4-12. Floating-Point Move Instructions 

Floating Move Register
    Mnemonic: fmr, fmr.
    Operand syntax: frD,frB
    Operation: The contents of frB are placed into frD.
    fmr       Floating Move Register
    fmr.      Floating Move Register with CR Update. The dot suffix enables the update
              of CR1.

Floating Negate
    Mnemonic: fneg, fneg.
    Operand syntax: frD,frB
    Operation: The contents of frB with bit 0 inverted are placed into frD.
    fneg      Floating Negate
    fneg.     Floating Negate with CR Update. The dot suffix enables the update of CR1.

Floating Absolute Value
    Mnemonic: fabs, fabs.
    Operand syntax: frD,frB
    Operation: The contents of frB with bit 0 cleared are placed into frD.
    fabs      Floating Absolute Value
    fabs.     Floating Absolute Value with CR Update. The dot suffix enables the
              update of CR1.

Floating Negative Absolute Value
    Mnemonic: fnabs, fnabs.
    Operand syntax: frD,frB
    Operation: The contents of frB with bit 0 set are placed into frD.
    fnabs     Floating Negative Absolute Value
    fnabs.    Floating Negative Absolute Value with CR Update. The dot suffix enables
              the update of CR1.


4.2.3 Load and Store Instructions 

Load and store instructions are issued and translated in program order; however, the 
accesses can occur out of order. Synchronizing instructions are provided to enforce strict 
ordering. This section describes the load and store instructions, which consist of the 
following: 

• Integer load instructions 

• Integer store instructions 

• Integer load and store with byte-reverse instructions 

• Integer load and store multiple instructions 

• Floating-point load instructions 

• Floating-point store instructions 

• Memory synchronization instructions 

4.2.3.1 Integer Load and Store Address Generation 

Integer load and store operations generate effective addresses using register indirect with 
immediate index mode (register contents + immediate), register indirect with index mode 
(register contents + register contents), or register indirect mode (register contents only). See 
Section 4.1.4.2, “Effective Address Calculation,” for information about calculating 
effective addresses. 

NOTE: In some implementations, operations that are not naturally aligned may suffer 
performance degradation. Refer to Section 6.4.6.1, “Integer Alignment 
Exceptions,” for additional information about load and store address alignment 
exceptions. 

4.2.3.1.1 Register Indirect with Immediate Index Addressing 
for Integer Loads and Stores 

Instructions using this addressing mode contain a signed 16-bit immediate index 
(d operand), which is sign extended and added to the contents of a general-purpose 
register specified in the instruction (rA operand) to generate the effective address. If 
the rA field of the instruction specifies r0, a value of zero is added to the immediate 
index (d operand) in place of the contents of r0. The option to specify rA or 0 is shown 
in the instruction descriptions as (rA|0). 

Figure 4-1 shows how an effective address is generated when using register indirect with 
immediate index addressing. 

[Figure 4-1 graphic: instruction encoding with field boundaries at bits 0, 5, 6, 10, 11, 
15, 16, and 31] 

Figure 4-1. Register Indirect with Immediate Index Addressing 
for Integer Loads and Stores 
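
The address calculation for this mode can be sketched as follows. The function names and
the list-of-integers register file are assumptions made for illustration; the result is
truncated to 32 bits to match the 32-bit architecture described in this book.

```python
def sign_extend16(d):
    """Interpret the low 16 bits of d as a signed two's-complement value."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def ea_immediate_index(gpr, ra, d):
    """EA = (rA|0) + sign-extended d, truncated to 32 bits."""
    base = 0 if ra == 0 else gpr[ra]   # an rA field of 0 means the value zero, not r0
    return (base + sign_extend16(d)) & 0xFFFFFFFF
```

With r3 holding 0x2000, a d field of 0xFFFC (that is, -4) yields an effective address of
0x1FFC, and specifying 0 for rA ignores whatever value r0 holds.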

4.2.3.1.2 Register Indirect with Index Addressing 
for Integer Loads and Stores 

Instructions using this addressing mode cause the contents of two general-purpose registers 
(specified as operands rA and rB) to be added in the generation of the effective address. A 
zero in place of the rA operand causes a zero to be added to the contents of the general- 
purpose register specified in operand rB (or the value zero for lswi and stswi instructions). 
The option to specify rA or 0 is shown in the instruction descriptions as (rA|0). 

Figure 4-2 shows how an effective address is generated when using register indirect with 
index addressing. 

Figure 4-2. Register Indirect with Index Addressing for Integer Loads and Stores 

4.2.3.1.3 Register Indirect Addressing for Integer Loads and Stores 

Instructions using this addressing mode use the contents of the general-purpose register 
specified by the rA operand as the effective address. A zero in the rA operand causes an 
effective address of zero to be generated. The option to specify rA or 0 is shown in the 
instruction descriptions as (rA|0). 

Figure 4-3 shows how an effective address is generated when using register indirect 
addressing. 

Figure 4-3. Register Indirect Addressing for Integer Loads and Stores 

4.2.3.2 Integer Load Instructions 

For integer load instructions, the byte, half word, or word addressed by the EA (effective 
address) is loaded into rD. Many integer load instructions have an update form, in which 
rA is updated with the generated effective address. For these forms, if rA ≠ 0 and rA ≠ rD 
(otherwise invalid), the EA is placed into rA and the memory element (byte, half word, or 
word) addressed by the EA is loaded into rD. 

NOTE: The PowerPC architecture defines load with update instructions with operand 
rA = 0, or rA = rD as invalid forms. 

The default byte and bit ordering is big-endian in the PowerPC architecture; see 
Section 3.1.2, “Byte Ordering,” for information about little-endian byte ordering. 

In some implementations of the architecture, the load algebraic instructions (lha, lhax) and 
the load with update (lbzu, lbzux, lhau, lhaux, lhzu, lhzux, lwzu, lwzux) instructions may 
execute with greater latency than other types of load instructions. Moreover, the load with 
update instructions may take longer to execute in some implementations than the 
corresponding pair of a non-update load followed by an add instruction to update the 
register. 

Table 4-13 summarizes the integer load instructions. 


Table 4-13. Integer Load Instructions 

Load Byte and Zero
    Mnemonic: lbz
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The byte in memory addressed by the EA is
    loaded into the low-order eight bits of rD. The remaining bits in rD are cleared.

Load Byte and Zero Indexed
    Mnemonic: lbzx
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The byte in memory addressed by the EA
    is loaded into the low-order eight bits of rD. The remaining bits in rD are cleared.

Load Byte and Zero with Update
    Mnemonic: lbzu
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA) + d. The byte in memory addressed by the EA is
    loaded into the low-order eight bits of rD. The remaining bits in rD are cleared.
    The EA is placed into rA.

Load Byte and Zero with Update Indexed
    Mnemonic: lbzux
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The byte in memory addressed by the EA is
    loaded into the low-order eight bits of rD. The remaining bits in rD are cleared.
    The EA is placed into rA.

Load Half Word and Zero
    Mnemonic: lhz
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The half word in memory addressed by the
    EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are cleared.

Load Half Word and Zero Indexed
    Mnemonic: lhzx
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The half word in memory addressed by
    the EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are
    cleared.

Load Half Word and Zero with Update
    Mnemonic: lhzu
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA) + d. The half word in memory addressed by the EA
    is loaded into the low-order 16 bits of rD. The remaining bits in rD are cleared.
    The EA is placed into rA.

Load Half Word and Zero with Update Indexed
    Mnemonic: lhzux
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The half word in memory addressed by the
    EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are cleared.
    The EA is placed into rA.

Load Half Word Algebraic
    Mnemonic: lha
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The half word in memory addressed by the
    EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are filled
    with a copy of the most significant bit of the loaded half word.

Load Half Word Algebraic Indexed
    Mnemonic: lhax
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The half word in memory addressed by
    the EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are
    filled with a copy of the most significant bit of the loaded half word.

Load Half Word Algebraic with Update
    Mnemonic: lhau
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA) + d. The half word in memory addressed by the EA
    is loaded into the low-order 16 bits of rD. The remaining bits in rD are filled
    with a copy of the most significant bit of the loaded half word. The EA is placed
    into rA.

Load Half Word Algebraic with Update Indexed
    Mnemonic: lhaux
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The half word in memory addressed by the
    EA is loaded into the low-order 16 bits of rD. The remaining bits in rD are filled
    with a copy of the most significant bit of the loaded half word. The EA is placed
    into rA.

Load Word and Zero
    Mnemonic: lwz
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The word in memory addressed by the EA is
    loaded into rD.

Load Word and Zero Indexed
    Mnemonic: lwzx
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The word in memory addressed by the EA
    is loaded into rD.

Load Word and Zero with Update
    Mnemonic: lwzu
    Operand syntax: rD,d(rA)
    Operation: The EA is the sum (rA) + d. The word in memory addressed by the EA is
    loaded into rD. The EA is placed into rA.

Load Word and Zero with Update Indexed
    Mnemonic: lwzux
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The word in memory addressed by the EA is
    loaded into rD. The EA is placed into rA.
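
The difference between the zero, algebraic, and update forms in Table 4-13 can be seen
in a small model. The sketch assumes big-endian byte ordering, a bytes object standing in
for memory, a Python list standing in for the GPRs, and a displacement d that has already
been sign extended; all function names are hypothetical.

```python
def load_zero(mem, ea, size):
    """Zero-extending load of size bytes (as for lbz, lhz, lwz)."""
    return int.from_bytes(mem[ea:ea + size], "big")

def load_algebraic_half(mem, ea):
    """lha-style load: upper 16 bits are copies of the half word's sign bit."""
    value = int.from_bytes(mem[ea:ea + 2], "big")
    if value & 0x8000:
        value |= 0xFFFF0000
    return value

def lhzu(gpr, mem, rd, d, ra):
    """Load Half Word and Zero with Update; rA = 0 and rA = rD are invalid forms."""
    if ra == 0 or ra == rd:
        raise ValueError("invalid form")
    ea = (gpr[ra] + d) & 0xFFFFFFFF
    gpr[rd] = load_zero(mem, ea, 2)
    gpr[ra] = ea                       # update form: the EA is written back to rA
```

The update form leaves the base register pointing at the element just loaded, which is
what makes it convenient for stepping through an array.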


4.2.3.3 Integer Store Instructions 

For integer store instructions, the contents of rS are stored into the byte, half word, or word 
in memory addressed by the EA (effective address). Many store instructions have an update 
form, in which rA is updated with the EA. For these forms, the following rules apply: 

• If rA ≠ 0, the effective address is placed into rA. 

• If rS = rA, the contents of register rS are copied to the target memory element, then 
the generated EA is placed into rA (rS). 

In general, the PowerPC architecture defines a sequential execution model. However, when 
a store instruction modifies a memory location that contains an instruction, software 
synchronization (isync) is required to ensure that subsequent instruction fetches from that 
location obtain the modified version of the instruction. 

If a program modifies the instructions it intends to execute, it should call the appropriate 
system library program before attempting to execute the modified instructions to ensure 
that the modifications have taken effect with respect to instruction fetching. 

The PowerPC architecture defines store with update instructions with rA = 0 as an invalid 
form. In addition, it defines integer store instructions with the CR update option enabled 
(Rc field, bit 31, in the instruction encoding = 1) to be an invalid form. Table 4-14 provides 
a summary of the integer store instructions. 


Table 4-14. Integer Store Instructions 

Store Byte
    Mnemonic: stb
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA|0) + d. The contents of the low-order eight bits
    of rS are stored into the byte in memory addressed by the EA.

Store Byte Indexed
    Mnemonic: stbx
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The contents of the low-order eight
    bits of rS are stored into the byte in memory addressed by the EA.

Store Byte with Update
    Mnemonic: stbu
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA) + d. The contents of the low-order eight bits of
    rS are stored into the byte in memory addressed by the EA. The EA is placed into rA.

Store Byte with Update Indexed
    Mnemonic: stbux
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA) + (rB). The contents of the low-order eight bits
    of rS are stored into the byte in memory addressed by the EA. The EA is placed
    into rA.

Store Half Word
    Mnemonic: sth
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA|0) + d. The contents of the low-order 16 bits of
    rS are stored into the half word in memory addressed by the EA.

Store Half Word Indexed
    Mnemonic: sthx
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The contents of the low-order 16 bits
    of rS are stored into the half word in memory addressed by the EA.

Store Half Word with Update
    Mnemonic: sthu
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA) + d. The contents of the low-order 16 bits of rS
    are stored into the half word in memory addressed by the EA. The EA is placed
    into rA.

Store Half Word with Update Indexed
    Mnemonic: sthux
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA) + (rB). The contents of the low-order 16 bits of
    rS are stored into the half word in memory addressed by the EA. The EA is placed
    into rA.

Store Word
    Mnemonic: stw
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA|0) + d. The contents of rS are stored into the
    word in memory addressed by the EA.

Store Word Indexed
    Mnemonic: stwx
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The contents of rS are stored into the
    word in memory addressed by the EA.

Store Word with Update
    Mnemonic: stwu
    Operand syntax: rS,d(rA)
    Operation: The EA is the sum (rA) + d. The contents of rS are stored into the word
    in memory addressed by the EA. The EA is placed into rA.

Store Word with Update Indexed
    Mnemonic: stwux
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA) + (rB). The contents of rS are stored into the
    word in memory addressed by the EA. The EA is placed into rA.
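
The rule for rS = rA in the update forms is easiest to see with stwu. The following
sketch is a hypothetical model, assuming big-endian byte ordering, a bytearray standing
in for memory, a list standing in for the GPRs, and a displacement d that has already
been sign extended.

```python
def stwu(gpr, mem, rs, d, ra):
    """Store Word with Update; rA = 0 is an invalid form."""
    if ra == 0:
        raise ValueError("invalid form")
    ea = (gpr[ra] + d) & 0xFFFFFFFF
    # The store uses the contents of rS from before the update, so when
    # rS = rA the old value of the register is what reaches memory.
    mem[ea:ea + 4] = gpr[rs].to_bytes(4, "big")
    gpr[ra] = ea

gpr = [0] * 32
gpr[1] = 0x40
mem = bytearray(0x80)
stwu(gpr, mem, 1, -16, 1)   # models stwu r1,-16(r1)
```

After the call, r1 holds 0x30 and the word at 0x30 holds the previous value of r1. This
store-then-update behavior with rS = rA is commonly used to allocate a stack frame while
saving a back pointer to the previous one.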


4.2.3.4 Integer Load and Store with Byte-Reverse Instructions 

Table 4-15 describes integer load and store with byte-reverse instructions. 

NOTE: In some PowerPC implementations, load byte-reverse instructions may have 
greater latency than other load instructions. 

When used in a PowerPC system operating with the default big-endian byte order, these 
instructions have the effect of loading and storing data in little-endian order. Likewise, 
when used in a PowerPC system operating with little-endian byte order, these instructions 
have the effect of loading and storing data in big-endian order. For more information about 
big-endian and little-endian byte ordering, see Section 3.1.2, “Byte Ordering.” 


Table 4-15. Integer Load and Store with Byte-Reverse Instructions 

Load Half Word Byte-Reverse Indexed
    Mnemonic: lhbrx
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The high-order eight bits of the half
    word addressed by the EA are loaded into the low-order eight bits of rD. The next
    eight higher-order bits of the half word in memory addressed by the EA are loaded
    into the next eight lower-order bits of rD. The remaining rD bits are cleared.

Load Word Byte-Reverse Indexed
    Mnemonic: lwbrx
    Operand syntax: rD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). Bits 0-7 of the word in memory
    addressed by the EA are loaded into the low-order eight bits of rD. Bits 8-15 of
    the word in memory addressed by the EA are loaded into bits 16-23 of rD. Bits 16-23
    of the word in memory addressed by the EA are loaded into bits 8-15. Bits 24-31 of
    the word in memory addressed by the EA are loaded into bits 0-7.

Store Half Word Byte-Reverse Indexed
    Mnemonic: sthbrx
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The contents of the low-order eight
    bits (24-31) of rS are stored into the high-order eight bits (0-7) of the half word
    in memory addressed by the EA. The contents of the next lower-order eight bits
    (16-23) of rS are stored into the next eight bits (8-15) of the half word in memory
    addressed by the EA.

Store Word Byte-Reverse Indexed
    Mnemonic: stwbrx
    Operand syntax: rS,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The contents of the low-order eight
    bits (24-31) of rS are stored into bits 0-7 of the word in memory addressed by the
    EA. The contents of the next eight lower-order bits (16-23) of rS are stored into
    bits 8-15 of the word in memory addressed by the EA. The contents of the next eight
    lower-order bits (8-15) of rS are stored into bits 16-23 of the word in memory
    addressed by the EA. The contents of the next eight bits (0-7) of rS are stored
    into bits 24-31 of the word addressed by the EA.
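
In big-endian mode, the bit-by-bit descriptions in Table 4-15 amount to treating the
bytes at the EA as a little-endian value. The sketch below illustrates this with
hypothetical function names, using a bytearray for memory and passing the already
computed EA directly.

```python
def lwbrx(mem, ea):
    """Load a word with its four bytes reversed."""
    return int.from_bytes(mem[ea:ea + 4], "little")

def lhbrx(mem, ea):
    """Load a half word with its two bytes reversed; upper bits are zero."""
    return int.from_bytes(mem[ea:ea + 2], "little")

def stwbrx(mem, ea, value):
    """Store the low-order 32 bits of value with the bytes reversed."""
    mem[ea:ea + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

def sthbrx(mem, ea, value):
    """Store the low-order 16 bits of value with the bytes reversed."""
    mem[ea:ea + 2] = (value & 0xFFFF).to_bytes(2, "little")
```

The byte at the EA (bits 0-7 of the word in memory) ends up in the low-order eight bits
of the register, matching the first mapping listed for lwbrx.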


4.2.3.5 Integer Load and Store Multiple Instructions 

The load/store multiple instructions are used to move blocks of data to and from the GPRs. 
The load multiple and store multiple instructions may have operands that require memory 
accesses crossing a 4-Kbyte page boundary. As a result, these instructions may be 
interrupted by a DSI exception associated with the address translation of the second page. 
Table 4-16 summarizes the integer load and store multiple instructions. 

In the load/store multiple instructions, the combination of the EA and rD (rS) is such that 
the low-order byte of GPR31 is loaded from or stored into the last byte of an aligned quad 
word in memory; if the effective address is not correctly aligned, it may take significantly 
longer to execute. 

In some PowerPC implementations operating with little-endian byte order, execution of an 
lmw or stmw instruction causes the system alignment error handler to be invoked; see 
Section 3.1.2, “Byte Ordering,” for more information. 

The PowerPC architecture defines the load multiple word (lmw) instruction with rA in the 
range of registers to be loaded, including the case in which rA = 0, as an invalid form. 


Table 4-16. Integer Load and Store Multiple Instructions 

Name                  Mnemonic   Operand Syntax   Operation
Load Multiple Word    lmw        rD,d(rA)         The EA is the sum (rA|0) + d. n = (32 - rD).
Store Multiple Word   stmw       rS,d(rA)         The EA is the sum (rA|0) + d. n = (32 - rS).
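
The quantity n in Table 4-16 is the number of consecutive words transferred, ending with
GPR31. A minimal sketch of lmw follows, with hypothetical names, big-endian byte
ordering, a bytes object for memory, and a displacement d that has already been sign
extended.

```python
def lmw(gpr, mem, rd, d, ra):
    """Load n = 32 - rD consecutive words into rD through r31; returns n."""
    if ra >= rd:
        # rA within the range of registers loaded (including rA = 0 when
        # rD = 0) is an invalid form.
        raise ValueError("invalid form")
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + d) & 0xFFFFFFFF
    n = 32 - rd
    for i in range(n):
        start = ea + 4 * i
        gpr[rd + i] = int.from_bytes(mem[start:start + 4], "big")
    return n
```

For instance, lmw with rD = 29 transfers three words, into r29, r30, and r31.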


4.2.3.6 Integer Load and Store String Instructions 

The integer load and store string instructions allow movement of data from memory to 
registers or from registers to memory without concern for alignment. These instructions can 
be used for a short move between arbitrary memory locations or to initiate a long move 
between misaligned memory fields. However, in some implementations, these instructions 
are likely to have greater latency and take longer to execute, perhaps much longer, than a 
sequence of individual load or store instructions that produce the same results. Table 4-17 
summarizes the integer load and store string instructions. 

Load and store string instructions execute more efficiently when rD or rS = 5, and the last 
register loaded or stored is less than or equal to 12. 

In some PowerPC implementations operating with little-endian byte order, execution of a 
load string or store string instruction causes the system alignment error handler to be 
invoked; see Section 3.1.2, “Byte Ordering,” for more information. 


Table 4-17. Integer Load and Store String Instructions 

Name                          Mnemonic   Operand Syntax   Operation
Load String Word Immediate    lswi       rD,rA,NB         The EA is (rA|0).
Load String Word Indexed      lswx       rD,rA,rB         The EA is the sum (rA|0) + (rB).
Store String Word Immediate   stswi      rS,rA,NB         The EA is (rA|0).
Store String Word Indexed     stswx      rS,rA,rB         The EA is the sum (rA|0) + (rB).


Load string and store string instructions may involve operands that are not word-aligned. 
As described in Section 6.4.6, “Alignment Exception (0x00600),” a misaligned string 
operation suffers a performance penalty compared to an aligned operation of the same type. 
A non-word-aligned string operation that crosses a double-word boundary is also slower 
than a word-aligned string operation. 
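
Table 4-17 gives only the effective address for the string instructions. The sketch
below illustrates how lswi distributes bytes across registers. The packing rules it
assumes (four bytes per register starting with the high-order byte, unused low-order
bytes cleared, an NB of 0 meaning 32 bytes, and wrapping from r31 to r0) come from the
instruction definitions rather than from this section, and the function name is
hypothetical. Invalid forms are not checked.

```python
def lswi(gpr, mem, rd, ra, nb):
    """Load nb bytes (32 if nb is 0) from (rA|0) into registers starting at rD."""
    base = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    reg = rd
    for offset in range(0, n, 4):
        chunk = bytes(mem[base + offset:base + min(offset + 4, n)])
        # Bytes fill the register from the high-order end; the rest is zero.
        gpr[reg] = int.from_bytes(chunk.ljust(4, b"\x00"), "big")
        reg = (reg + 1) % 32   # the register sequence wraps from r31 to r0
```

Loading six bytes starting at r31 fills r31 completely and places the remaining two
bytes in the high-order half of r0.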

4.2.3.7 Floating-Point Load and Store Address Generation 

Floating-point load and store operations generate effective addresses using the register 
indirect with immediate index addressing mode and register indirect with index addressing 
mode. Floating-point loads and stores are not supported for direct-store interface accesses. 
The use of floating-point loads and stores for direct-store interface accesses results in an 
alignment exception. 

NOTE: The direct-store facility is being phased out of the architecture and is not likely 
to be supported in future devices. 

4.2.3.7.1 Register Indirect (contents) with Immediate Index Addressing 
for Floating-Point Loads and Stores 

Instructions using this addressing mode contain a signed 16-bit immediate index 
(d operand), which is sign extended to 32 bits and added to the contents of a GPR specified 
in the instruction (rA operand) to generate the effective address. If the rA field of the 
instruction specifies r0, a value of zero is added to the immediate index (d operand) in place 
of the contents of r0. The option to specify rA or 0 is shown in the instruction descriptions 
as (rA|0). 

Figure 4-4 shows how an effective address is generated when using register indirect with 
immediate index addressing for floating-point loads and stores. 

Figure 4-4. Register Indirect with Immediate Index Addressing for Floating-Point 
Loads/Stores 

4.2.3.7.2 Register Indirect (contents) with Index Addressing for Floating-Point 
Loads and Stores 

Instructions using this addressing mode add the contents of two GPRs (specified in 
operands rA and rB) to generate the effective address. A zero in the rA operand causes a 
zero to be added to the contents of the GPR specified in operand rB. This is shown in the 
instruction descriptions as (rA|0). 

Figure 4-5 shows how an effective address is generated when using register indirect with 
index addressing. 

Figure 4-5. Register Indirect with Index Addressing for Floating-Point Loads/Stores 


The PowerPC architecture defines floating-point load and store with update instructions 
(lfsu, lfsux, lfdu, lfdux, stfsu, stfsux, stfdu, stfdux) with operand rA = 0 as invalid forms 
of the instructions. In addition, it defines floating-point load and store instructions with the 
CR updating option enabled (Rc bit, bit 31 = 1) to be an invalid form. 

The PowerPC architecture defines that the FPSCR[UE] bit should not be used to determine 
whether denormalization should be performed on floating-point stores. 

4.2.3.8 Floating-Point Load Instructions 

There are two forms of the floating-point load instruction: single-precision and 
double-precision operand formats. Because the FPRs support only the floating-point 
double-precision format, single-precision floating-point load instructions convert 
single-precision data to double-precision format before loading the operands into the 
target FPR. This conversion is described fully in Section D.6, “Floating-Point Load 
Instructions.” Table 4-18 provides a summary of the floating-point load instructions. 

NOTE: The PowerPC architecture defines load with update instructions with rA = 0 as 
an invalid form. 

Table 4-18. Floating-Point Load Instructions 

Load Floating-Point Single
    Mnemonic: lfs
    Operand syntax: frD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The word in memory addressed by the EA is
    interpreted as a floating-point single-precision operand. This word is converted to
    floating-point double-precision format and placed into frD.

Load Floating-Point Single Indexed
    Mnemonic: lfsx
    Operand syntax: frD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The word in memory addressed by the EA
    is interpreted as a floating-point single-precision operand. This word is converted
    to floating-point double-precision format and placed into frD.

Load Floating-Point Single with Update
    Mnemonic: lfsu
    Operand syntax: frD,d(rA)
    Operation: The EA is the sum (rA) + d. The word in memory addressed by the EA is
    interpreted as a floating-point single-precision operand. This word is converted to
    floating-point double-precision format and placed into frD. The EA is placed into
    the register specified by rA.

Load Floating-Point Single with Update Indexed
    Mnemonic: lfsux
    Operand syntax: frD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The word in memory addressed by the EA is
    interpreted as a floating-point single-precision operand. This word is converted to
    floating-point double-precision format and placed into frD. The EA is placed into
    the register specified by rA.

Load Floating-Point Double
    Mnemonic: lfd
    Operand syntax: frD,d(rA)
    Operation: The EA is the sum (rA|0) + d. The double word in memory addressed by the
    EA is placed into register frD.

Load Floating-Point Double Indexed
    Mnemonic: lfdx
    Operand syntax: frD,rA,rB
    Operation: The EA is the sum (rA|0) + (rB). The double word in memory addressed by
    the EA is placed into register frD.

Load Floating-Point Double with Update
    Mnemonic: lfdu
    Operand syntax: frD,d(rA)
    Operation: The EA is the sum (rA) + d. The double word in memory addressed by the
    EA is placed into register frD. The EA is placed into the register specified by rA.

Load Floating-Point Double with Update Indexed
    Mnemonic: lfdux
    Operand syntax: frD,rA,rB
    Operation: The EA is the sum (rA) + (rB). The double word in memory addressed by
    the EA is placed into register frD. The EA is placed into the register specified
    by rA.



4.2.3.9 Floating-Point Store Instructions

This section describes floating-point store instructions. There are three basic forms of the 
store instruction — single-precision, double-precision, and integer. The integer form is 
supported by the stfiwx instruction. 

NOTE: The stfiwx instruction is defined as optional by the PowerPC architecture to
ensure backwards compatibility with earlier processors; however, it will likely be
required for subsequent PowerPC processors.

Because the FPRs support only floating-point, double-precision format for floating-point 
data, single-precision floating-point store instructions convert double-precision data to 
single-precision format before storing the operands. The conversion steps are described 
fully in Section D.7, “Floating-Point Store Instructions.” Table 4-19 provides a summary of 
the floating-point store instructions. 

NOTE: The PowerPC architecture defines store with update instructions with rA = 0 as 
an invalid form. 


Table 4-19 provides the floating-point store instructions for the PowerPC processors. 

Table 4-19. Floating-Point Store Instructions

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Store Floating-Point Single | stfs | frS,d(rA) | The EA is the sum (rA\|0) + d. The contents of frS are converted to single-precision and stored into the word in memory addressed by the EA. |
| Store Floating-Point Single Indexed | stfsx | frS,rA,rB | The EA is the sum (rA\|0) + (rB). The contents of frS are converted to single-precision and stored into the word in memory addressed by the EA. |
| Store Floating-Point Single with Update | stfsu | frS,d(rA) | The EA is the sum (rA) + d. The contents of frS are converted to single-precision and stored into the word in memory addressed by the EA. The EA is placed into rA. |
| Store Floating-Point Single with Update Indexed | stfsux | frS,rA,rB | The EA is the sum (rA) + (rB). The contents of frS are converted to single-precision and stored into the word in memory addressed by the EA. The EA is placed into rA. |
| Store Floating-Point Double | stfd | frS,d(rA) | The EA is the sum (rA\|0) + d. The contents of frS are stored into the double word in memory addressed by the EA. |
| Store Floating-Point Double Indexed | stfdx | frS,rA,rB | The EA is the sum (rA\|0) + (rB). The contents of frS are stored into the double word in memory addressed by the EA. |
| Store Floating-Point Double with Update | stfdu | frS,d(rA) | The EA is the sum (rA) + d. The contents of frS are stored into the double word in memory addressed by the EA. The EA is placed into rA. |
| Store Floating-Point Double with Update Indexed | stfdux | frS,rA,rB | The EA is the sum (rA) + (rB). The contents of frS are stored into the double word in memory addressed by the EA. The EA is placed into rA. |
| Store Floating-Point as Integer Word Indexed | stfiwx | frS,rA,rB | The EA is the sum (rA\|0) + (rB). The contents of the low-order 32 bits of frS are stored, without conversion, into the word in memory addressed by the EA. Note: The stfiwx instruction is defined as optional by the PowerPC architecture to ensure backwards compatibility with earlier processors; however, it will likely be required for subsequent PowerPC processors. |
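
The difference between the converting stores and stfiwx can be illustrated with a short sketch. The Python fragment below is only a model: the function names and the convention of passing the 64-bit register image as an integer are assumptions made for the example, big-endian byte order is assumed, and the single-precision case is meaningful only for values representable in single precision (the exact conversion steps are those of Section D.7, not of this sketch).

```python
import struct

def stfs_bytes(frs_bits):
    """stfs: convert the double-precision image in frS to single precision."""
    (value,) = struct.unpack(">d", frs_bits.to_bytes(8, "big"))
    return struct.pack(">f", value)

def stfd_bytes(frs_bits):
    """stfd: store all 64 bits of frS unchanged."""
    return frs_bits.to_bytes(8, "big")

def stfiwx_bytes(frs_bits):
    """stfiwx: store the low-order 32 bits of frS without conversion."""
    return (frs_bits & 0xFFFFFFFF).to_bytes(4, "big")
```

In this model the register image 0x3FF0000000000000 (1.0 in double precision) is stored as 0x3F800000 by stfs_bytes, whereas stfiwx_bytes stores only the low-order word, 0x00000000, because no conversion takes place.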


4.2.4 Branch and Flow Control Instructions 

Some branch instructions can redirect instruction execution conditionally based on the 
value of bits in the CR. When the processor encounters one of these instructions, it scans 
the execution pipelines to determine whether an instruction in progress may affect the 
particular CR bit. If no interlock is found, the branch can be resolved immediately by 
checking the bit in the CR and taking the action defined for the branch instruction. 

If an interlock is detected, the branch is considered unresolved, and the direction of the
branch may be predicted either by using the y bit (as described in Table 4-20) or by using
dynamic prediction. The interlock is monitored while instructions are fetched for the
predicted branch. When the interlock is cleared, the processor determines whether the 
prediction was correct based on the value of the CR bit. If the prediction is correct, the 
branch is considered completed and instruction fetching continues along the predicted path. 
If the prediction is incorrect, the fetched instructions are purged, and instruction fetching 
continues along the alternate path. 

4.2.4.1 Branch Instruction Address Calculation

Branch instructions can alter the sequence of instruction execution. Instruction addresses 
are always assumed to be word aligned; the PowerPC processors ignore the two low-order 
bits of the generated branch target address. 

Branch instructions compute the effective address (EA) of the next instruction
using the following addressing modes:

• Branch relative 

• Branch conditional to relative address 

• Branch to absolute address 

• Branch conditional to absolute address 

• Branch conditional to link register 

• Branch conditional to count register 

4.2.4.1.1 Branch Relative Addressing Mode

Instructions that use branch relative addressing generate the next instruction address by 
sign extending and appending 0b00 to the immediate displacement operand LI, and adding
the resultant value to the current instruction address. Branches using this addressing mode 
have the absolute addressing option disabled (AA field, bit 30, in the instruction 
encoding = 0). The link register (LR) update option can be enabled (LK field, bit 31, in the 
instruction encoding = 1). This option causes the effective address of the instruction 
following the branch instruction to be placed in the LR. 

Figure 4-6 shows how the branch target address is generated when using the branch relative 
addressing mode. 

Figure 4-6. Branch Relative Addressing

4.2.4.1.2 Branch Conditional to Relative Addressing Mode

If the branch conditions are met, instructions that use the branch conditional to relative 
addressing mode generate the next instruction address by sign extending and appending 
0b00 to the immediate displacement operand (BD) and adding the resultant value to the
current instruction address. Branches using this addressing mode have the absolute 
addressing option disabled (AA field, bit 30, in the instruction encoding = 0). The link 
register update option can be enabled (LK field, bit 31, in the instruction encoding = 1). 
This option causes the effective address of the instruction following the branch instruction 
to be placed in the LR. 

Figure 4-7 shows how the branch target address is generated when using the branch
conditional relative addressing mode.

Figure 4-7. Branch Conditional Relative Addressing

4.2.4.1.3 Branch to Absolute Addressing Mode

Instructions that use branch to absolute addressing mode generate the next instruction 
address by sign extending and appending 0b00 to the LI operand. Branches using this
addressing mode have the absolute addressing option enabled (AA field, bit 30, in the
instruction encoding = 1). The link register update option can be enabled (LK field, bit 31,
in the instruction encoding = 1). This option causes the effective address of the instruction
following the branch instruction to be placed in the LR. 

Figure 4-8 shows how the branch target address is generated when using the branch to 
absolute addressing mode. 

Figure 4-8. Branch to Absolute Addressing
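
The target calculation for the unconditional branch forms (relative and absolute, both using the LI operand) can be expressed compactly. The following Python sketch is illustrative only; the function names are assumptions made for the example, LI is taken to be the 24-bit field from the instruction, and addresses are treated as 32-bit values.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_target(cia, li, aa=0, lk=0):
    """Return (target, new_lr) for b/ba/bl/bla; new_lr is None when LK = 0."""
    offset = sign_extend((li & 0xFFFFFF) << 2, 26)    # LI || 0b00, sign-extended
    base = 0 if aa else cia                           # AA = 1 selects absolute addressing
    target = (base + offset) & 0xFFFFFFFF
    new_lr = ((cia + 4) & 0xFFFFFFFF) if lk else None # address of the following instruction
    return target, new_lr
```

For example, with the current instruction at 0x1000 and LI = 0xFFFFFF (a displacement of -4 bytes), the relative form yields the target 0x0FFC.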

4.2.4.1.4 Branch Conditional to Absolute Addressing Mode

If the branch conditions are met, instructions that use the branch conditional to absolute
addressing mode generate the next instruction address by sign extending and appending
0b00 to the BD operand. Branches using this addressing mode have the absolute addressing
option enabled (AA field, bit 30, in the instruction encoding = 1). The link register update
option can be enabled (LK field, bit 31, in the instruction encoding = 1). This option causes
the effective address of the instruction following the branch instruction to be placed in the
LR.

Figure 4-9 shows how the branch target address is generated when using the branch
conditional to absolute addressing mode.

Figure 4-9. Branch Conditional to Absolute Addressing
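
The conditional forms that use the BD operand follow the same pattern with a narrower displacement. The Python sketch below is a model only; the function name and argument order are assumptions made for the example, BD is taken to be the 14-bit field from the instruction, and the condition is assumed to have been evaluated already.

```python
def bc_next_address(cia, bd, condition_met, aa=0):
    """Next instruction address for bc/bca given the 14-bit BD operand."""
    if not condition_met:
        return (cia + 4) & 0xFFFFFFFF        # fall through to the next instruction
    disp = (bd & 0x3FFF) << 2                # BD || 0b00 forms a 16-bit value
    if disp & 0x8000:
        disp -= 0x10000                      # sign-extend the 16-bit value
    base = 0 if aa else cia                  # AA = 1 selects absolute addressing
    return (base + disp) & 0xFFFFFFFF
```

With the current instruction at 0x2000 and BD = 0x3FFE, a taken relative branch goes to 0x1FF8; if the condition is not met, execution continues at 0x2004.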

4.2.4.1.5 Branch Conditional to Link Register Addressing Mode

If the branch conditions are met, the branch conditional to link register instruction generates 
the next instruction address by using the contents of the LR and clearing the two low-order 
bits to zero. The result becomes the effective address from which the next instructions are 
fetched. 

The link register update option can be enabled (LK field, bit 31, in the instruction encoding 
= 1). This option causes the effective address of the instruction following the branch 
instruction to be placed in the LR. This is done even if the branch is not taken. 

Figure 4-10 shows how the branch target address is generated when using the branch
conditional to link register addressing mode.

Figure 4-10. Branch Conditional to Link Register Addressing

4.2.4.1.6 Branch Conditional to Count Register Addressing Mode

If the branch conditions are met, the branch conditional to count register instruction 
generates the next instruction address by using the contents of the count register (CTR) and 
clearing the two low-order bits to zero. The result becomes the effective address from which 
the next instructions are fetched. 

The link register update option can be enabled (LK field, bit 31, in the instruction 
encoding = 1). This option causes the effective address of the instruction following the 
branch instruction to be placed in the LR. This is done even if the branch is not taken. 

Figure 4-11 shows how the branch target address is generated when using the branch 
conditional to count register addressing mode. 

Figure 4-11. Branch Conditional to Count Register Addressing
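
The two register-based addressing modes differ only in which register supplies the target. A minimal Python sketch of the behavior described above follows; the function name and the convention of passing the current LR value in and out are assumptions made for the example.

```python
def branch_to_register(cia, reg_value, condition_met, lk=0, lr=0):
    """Return (next_address, lr) for bclr[l]/bcctr[l].

    reg_value holds the contents of the LR or the CTR before the branch.
    """
    target = reg_value & 0xFFFFFFFC                   # clear the two low-order bits
    next_address = target if condition_met else (cia + 4) & 0xFFFFFFFF
    if lk:
        lr = (cia + 4) & 0xFFFFFFFF                   # done even if the branch is not taken
    return next_address, lr
```

Note that the target is captured before the LR is updated, so a bclrl in this model branches to the old LR contents and then records the new return address.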

4.2.4.2 Conditional Branch Control

For branch conditional instructions, the BO operand specifies the conditions under which 
the branch is taken. The first four bits of the BO operand specify how the branch is affected 
by or affects the condition and count registers. The fifth bit, shown in Table 4-20 as having 
the value y, is used by some PowerPC implementations for branch prediction as described 
below. 

The encodings for the BO operands are shown in Table 4-20. If the BO field specifies that 
the CTR is to be decremented, the entire 32-bit CTR is decremented. 


Table 4-20. BO Operand Encodings

| BO | Description |
|---|---|
| 0000y | Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE. |
| 0001y | Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE. |
| 001zy | Branch if the condition is FALSE. |
| 0100y | Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE. |
| 0101y | Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE. |
| 011zy | Branch if the condition is TRUE. |
| 1z00y | Decrement the CTR, then branch if the decremented CTR ≠ 0. |
| 1z01y | Decrement the CTR, then branch if the decremented CTR = 0. |
| 1z1zz | Branch always. |

In this table, z indicates a bit that is ignored.

Note: The z bits should be cleared, as they may be assigned a meaning in some future version of the
PowerPC architecture.

The y bit provides a hint about whether a conditional branch is likely to be taken, and may be used by some
PowerPC implementations to improve performance.
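
The BO encodings can be read as four independent control bits plus the y hint. The Python sketch below models that reading; it is not taken from the architecture text. The helper names are assumptions made for the example, BO bit 0 is treated as the most significant of the five bits, and cr_bit is the value (0 or 1) of the CR bit selected by BI.

```python
def bo_bit(bo, i):
    """Bit i of the 5-bit BO operand, numbering bit 0 as the most significant."""
    return (bo >> (4 - i)) & 1

def evaluate_bo(bo, cr_bit, ctr):
    """Return (taken, new_ctr) for a branch conditional with the given BO."""
    if not bo_bit(bo, 2):
        ctr = (ctr - 1) & 0xFFFFFFFF      # the entire 32-bit CTR is decremented
    # BO[2] = 1 skips the CTR test; otherwise BO[3] selects CTR = 0 or CTR != 0.
    ctr_ok = bool(bo_bit(bo, 2)) or ((ctr != 0) != bool(bo_bit(bo, 3)))
    # BO[0] = 1 skips the condition test; otherwise BO[1] is the required value.
    cond_ok = bool(bo_bit(bo, 0)) or (cr_bit == bo_bit(bo, 1))
    return ctr_ok and cond_ok, ctr
```

For instance, BO = 0b10000 (the 1z00y encoding) with a CTR of 2 decrements the CTR to 1 and takes the branch, which is the usual loop-closing form.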


The branch always encoding of the BO operand does not have a y bit. 

Clearing the y bit indicates a predicted behavior for the branch instruction as follows: 

• For bcx with a negative value in the displacement operand, the branch is predicted
taken.

• In all other cases (bcx with a non-negative value in the displacement operand, bclrx,
or bcctrx), the branch is predicted not taken.

Setting the y bit reverses the preceding indications. 

The sign of the displacement operand is used as described above even if the target is an 
absolute address. The default value for the y bit should be 0, and should only be set to 1 if 
software has determined that the prediction corresponding to y = 1 is more likely to be 
correct than the prediction corresponding to y = 0. Software that does not compute branch 
predictions should clear the y bit. 

In most cases, the branch should be predicted to be taken if the value of the following 
expression is 1, and predicted to fall through if the value is 0. 

((BO[0] & BO[2]) | S) ⊕ BO[4]

In the expression above, S (bit 16 of the branch conditional instruction coding) is the sign 
bit of the displacement operand if the instruction has a displacement operand and is 0 if the 
operand is reserved. BO[4] is the y bit, or 0 for the branch always encoding of the BO 
operand. (Advantage is taken of the fact that, for bclrx and bcctrx, bit 16 of the instruction
is part of a reserved operand and therefore must be 0.) 
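
The static prediction rule can be checked against the y-bit description with a few lines of code. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the final operator of the expression is taken to be exclusive OR (the reading that agrees with the rule that y = 0 and a negative displacement predict taken), and a displacement of None stands for bclrx and bcctrx, which have no displacement operand.

```python
def predict_taken(bo, displacement=None):
    """Static prediction derived from BO and the sign of the displacement."""
    bo0 = (bo >> 4) & 1
    bo2 = (bo >> 2) & 1
    branch_always = bool(bo0 and bo2)
    y = 0 if branch_always else bo & 1      # branch always has no y bit
    s = 1 if (displacement is not None and displacement < 0) else 0
    return bool(((bo0 & bo2) | s) ^ y)
```

In this model a backward bcx with y = 0 is predicted taken, a forward one is predicted not taken, and setting y reverses both results.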

The 5-bit BI operand in branch conditional instructions specifies which of the 32 bits in the 
CR represents the bit to test. 

When the branch instructions contain immediate addressing operands, the branch target 
addresses can be computed sufficiently ahead of the branch execution and instructions can 
be fetched along the branch target path (if the branch is predicted to be taken or is an 
unconditional branch). If the branch instructions use the link or count register contents for 
the branch target address, instructions along the branch-taken path of a branch can be 
fetched if the link or count register is loaded sufficiently ahead of the branch instruction 
execution. 

Branching can be conditional or unconditional. The branch target address is first calculated
from the contents of the count or link register or from the branch immediate field. 
Optionally, a branch return address can be loaded into the LR register (this sets the return 
address for subroutine calls). When this option is selected (LK=1) the LR is loaded with the 
effective address of the instruction following the branch instruction. 

Some processors may keep a stack of the link register values most recently set by branch 
and link instructions, with the possible exception of the form shown below for obtaining 
the address of the next instruction. To benefit from this stack, the following programming 
conventions should be used. 

In the following examples, let A, B, and Glue represent subroutine labels: 

• Obtaining the address of the next instruction: use the following form of branch and
link:

bcl 20,31,$+4

• Loop counts: 

Keep loop counts in the count register, and use one of the branch conditional 
instructions to decrement the count and to control branching (for example, 
branching back to the start of a loop if the decremented counter value is nonzero). 

• Computed GOTOs, case statements, etc.: 

Use the count register to hold the address to branch to, and use the bcctr instruction 
with the link register option disabled (LK = 0) to branch to the selected address. 

• Direct subroutine linkage — where A calls B and B returns to A. The two branches 
should be as follows: 

— A calls B: use a branch instruction that enables the link register (LK = 1).

— B returns to A: use the bclr instruction with the link register option disabled 
(LK = 0) (the return address is in, or can be restored to, the link register). 

• Indirect subroutine linkage: 

Where A calls Glue, Glue calls B, and B returns to A rather than to Glue. (Such a 
calling sequence is common in linkage code used when the subroutine that the 
programmer wants to call, here B, is in a different module from the caller: the binder 
inserts “glue” code to mediate the branch.) The three branches should be as follows: 

— A calls Glue: use a branch instruction that sets the link register with the link 
register option enabled (LK = 1).

— Glue calls B: place the address of B in the count register, and use the bcctr 
instruction with the link register option disabled (LK = 0). 

— B returns to A: use the bclr instruction with the link register option disabled 
(LK = 0) (the return address is in, or can be restored to, the link register). 
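
The reason these conventions matter can be seen in a toy model of the link register stack mentioned above. The Python sketch below is hypothetical throughout: the class name, the stack depth, and the policy of discarding the oldest entry are assumptions made for the example and do not describe any particular PowerPC implementation.

```python
class LinkStack:
    """Toy model of a processor-internal stack of recent link register values."""

    def __init__(self, depth=8):
        self.depth = depth            # assumed capacity, chosen for illustration
        self.entries = []

    def on_branch_and_link(self, cia, is_get_next_address=False):
        """Record the return address unless this is the bcl 20,31,$+4 idiom."""
        if is_get_next_address:
            return                    # the idiom is not a subroutine call
        if len(self.entries) == self.depth:
            self.entries.pop(0)       # the oldest entry is lost
        self.entries.append((cia + 4) & 0xFFFFFFFF)

    def predict_return(self):
        """Predicted target for bclr with LK = 0, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None
```

Because each call pushes one entry and each bclr return pops one, code that pairs calls and returns as recommended keeps the predicted and actual return addresses in step, while using bclr for a computed GOTO would consume an entry that belongs to a pending return.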

4.2.4.3 Branch Instructions

Table 4-21 describes the branch instructions provided by the PowerPC processors. 


Table 4-21. Branch Instructions

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Branch | b | target_addr | Branch. Branch to the address computed as the sum of the immediate address and the address of the current instruction. |
| | ba | target_addr | Branch Absolute. Branch to the absolute address specified. |
| | bl | target_addr | Branch then Link. Branch to the address computed as the sum of the immediate address and the address of the current instruction. The instruction address following this instruction is placed into the link register (LR). |
| | bla | target_addr | Branch Absolute then Link. Branch to the absolute address specified. The instruction address following this instruction is placed into the LR. |
| Branch Conditional | bc, bca, bcl, bcla | BO,BI,target_addr | The BI operand specifies the bit in the CR to be used as the condition of the branch. The BO operand is used as described in Table 4-20. |
| | bc | | Branch Conditional. Branch conditionally to the address computed as the sum of the immediate address and the address of the current instruction. |
| | bca | | Branch Conditional Absolute. Branch conditionally to the absolute address specified. |
| | bcl | | Branch Conditional then Link. Branch conditionally to the address computed as the sum of the immediate address and the address of the current instruction. The instruction address following this instruction is placed into the LR. |
| | bcla | | Branch Conditional Absolute then Link. Branch conditionally to the absolute address specified. The instruction address following this instruction is placed into the LR. |
| Branch Conditional to Link Register | bclr, bclrl | BO,BI | The BI operand specifies the bit in the CR to be used as the condition of the branch. The BO operand is used as described in Table 4-20, and the branch target address is LR[0-29] \|\| 0b00. |
| | bclr | | Branch Conditional to Link Register. Branch conditionally to the address in the LR. |
| | bclrl | | Branch Conditional to Link Register then Link. Branch conditionally to the address specified in the LR. The instruction address following this instruction is then placed into the LR. |
| Branch Conditional to Count Register | bcctr, bcctrl | BO,BI | The BI operand specifies the bit in the CR to be used as the condition of the branch. The BO operand is used as described in Table 4-20, and the branch target address is CTR[0-29] \|\| 0b00. |
| | bcctr | | Branch Conditional to Count Register. Branch conditionally to the address specified in the count register. |
| | bcctrl | | Branch Conditional to Count Register then Link. Branch conditionally to the address specified in the count register. The instruction address following this instruction is placed into the LR. |
| | | | Note: If the “decrement and test CTR” option is specified (BO[2] = 0), the instruction form is invalid. |


4.2.4.4 Simplified Mnemonics for Branch Processor Instructions

To simplify assembly language programming, a set of simplified mnemonics and symbols 
is provided for the most frequently used forms of branch conditional, compare, trap, rotate 
and shift, and certain other instructions. See Appendix F, “Simplified Mnemonics,” for a 
list of simplified mnemonic examples. 

4.2.4.5 Condition Register Logical Instructions

Condition register logical instructions, shown in Table 4-22, and the Move Condition 
Register Field (mcrf) instruction are also defined as flow control instructions. 

NOTE: If the LR update option is enabled for any of these instructions, the PowerPC 
architecture defines these forms of the instructions as invalid. 


Table 4-22. Condition Register Logical Instructions

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Condition Register AND | crand | crbD,crbA,crbB | The CR bit specified by crbA is ANDed with the CR bit specified by crbB. The result is placed into the CR bit specified by crbD. |
| Condition Register OR | cror | crbD,crbA,crbB | The CR bit specified by crbA is ORed with the CR bit specified by crbB. The result is placed into the CR bit specified by crbD. |
| Condition Register XOR | crxor | crbD,crbA,crbB | The CR bit specified by crbA is XORed with the CR bit specified by crbB. The result is placed into the CR bit specified by crbD. |
| Condition Register NAND | crnand | crbD,crbA,crbB | The CR bit specified by crbA is ANDed with the CR bit specified by crbB. The complemented result is placed into the CR bit specified by crbD. |
| Condition Register NOR | crnor | crbD,crbA,crbB | The CR bit specified by crbA is ORed with the CR bit specified by crbB. The complemented result is placed into the CR bit specified by crbD. |
| Condition Register Equivalent | creqv | crbD,crbA,crbB | The CR bit specified by crbA is XORed with the CR bit specified by crbB. The complemented result is placed into the CR bit specified by crbD. |
| Condition Register AND with Complement | crandc | crbD,crbA,crbB | The CR bit specified by crbA is ANDed with the complement of the CR bit specified by crbB and the result is placed into the CR bit specified by crbD. |
| Condition Register OR with Complement | crorc | crbD,crbA,crbB | The CR bit specified by crbA is ORed with the complement of the CR bit specified by crbB and the result is placed into the CR bit specified by crbD. |
| Move Condition Register Field | mcrf | crfD,crfS | The contents of crfS are copied into crfD. No other condition register fields are changed. |
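
The operations in Table 4-22 act on single CR bits, or on one 4-bit field in the case of mcrf. The Python sketch below models them on a 32-bit integer; the helper names and the dictionary of operations are assumptions made for the example, and CR bit 0 is treated as the most significant bit.

```python
def cr_get(cr, bit):
    """Read CR bit `bit`, where bit 0 is the most significant of 32."""
    return (cr >> (31 - bit)) & 1

def cr_set(cr, bit, value):
    """Return cr with CR bit `bit` forced to value (0 or 1)."""
    mask = 1 << (31 - bit)
    return (cr | mask) if value else (cr & ~mask)

CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 - (a & b),
    "crnor":  lambda a, b: 1 - (a | b),
    "creqv":  lambda a, b: 1 - (a ^ b),
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
}

def cr_logical(op, cr, crbd, crba, crbb):
    """Apply one CR logical instruction and return the new CR value."""
    result = CR_OPS[op](cr_get(cr, crba), cr_get(cr, crbb))
    return cr_set(cr, crbd, result)

def mcrf(cr, crfd, crfs):
    """Copy 4-bit field crfS into field crfD; other fields are unchanged."""
    field = (cr >> (28 - 4 * crfs)) & 0xF
    shift = 28 - 4 * crfd
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)
```

As the model shows, crxor with the same bit for all three operands always clears that bit, and creqv with the same bit always sets it.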


4.2.4.6 Trap Instructions

The trap instructions shown in Table 4-23 are provided to test for a specified set of 
conditions. If any of the conditions tested by a trap instruction are met, the system trap 
handler is invoked. If the tested conditions are not met, instruction execution continues 
normally. See Appendix F, “Simplified Mnemonics,” for a complete set of simplified 
mnemonics. 


Table 4-23. Trap Instructions

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Trap Word Immediate | twi | TO,rA,SIMM | The contents of rA are compared with the sign-extended SIMM operand. If any bit in the TO operand is set and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. |
| Trap Word | tw | TO,rA,rB | The contents of rA are compared with the contents of rB. If any bit in the TO operand is set and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. |
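
The trap test can be modeled as five comparisons gated by the TO operand. The Python sketch below is illustrative; the function names are assumptions made for the example, and the assignment of conditions to TO bits (bit 0 signed less than, bit 1 signed greater than, bit 2 equal, bit 3 unsigned less than, bit 4 unsigned greater than, with bit 0 the most significant) is taken from the detailed instruction descriptions rather than from this section.

```python
def to_signed(x):
    """Interpret a 32-bit value as a two's-complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def trap_taken(to, a, b):
    """True if tw/twi would invoke the trap handler for operands a and b.

    For twi, b is the sign-extended SIMM operand.
    """
    a_u, b_u = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    a_s, b_s = to_signed(a), to_signed(b)
    conditions = [a_s < b_s, a_s > b_s, a_u == b_u, a_u < b_u, a_u > b_u]
    return any(cond and (to >> (4 - i)) & 1 for i, cond in enumerate(conditions))
```

Setting all five TO bits therefore traps for any pair of operands, and clearing all of them never traps.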



4.2.4.7 System Linkage Instruction — UISA

Table 4-24 describes the System Call (sc) instruction that permits a program to call on the 
system to perform a service. See Section 4.4.1, “System Linkage Instructions — OEA,” for 
a complete description of the sc instruction. 


Table 4-24. System Linkage Instruction — UISA

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| System Call | sc | | This instruction calls the operating system to perform a service. When control is returned to the program that executed the system call, the content of the registers will depend on the register conventions used by the program providing the system service. This instruction is context synchronizing as described in Section 4.1.5.1, “Context Synchronizing Instructions.” See Section 4.4.1, “System Linkage Instructions — OEA,” for a complete description of the sc instruction. |

4.2.5 Processor Control Instructions — UISA

Processor control instructions are used to read from and write to the condition register 
(CR), machine state register (MSR), and special-purpose registers (SPRs). See 
Section 4.3.1, “Processor Control Instructions — VEA,” for the mftb instruction and 
Section 4.4.2, “Processor Control Instructions — OEA,” for information about the 
instructions used for reading from and writing to the MSR and SPRs. 


4.2.5.1 Move to/from Condition Register Instructions

Table 4-25 summarizes the instructions for reading from or writing to the condition register. 


Table 4-25. Move to/from Condition Register Instructions

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Move to Condition Register Fields | mtcrf | CRM,rS | The contents of rS are placed into the CR under control of the field mask specified by operand CRM. The field mask identifies the 4-bit fields affected. Let i be an integer in the range 0-7. If CRM(i) = 1, CR field i (CR bits 4 * i through 4 * i + 3) is set to the contents of the corresponding field of rS. |
| Move to Condition Register from XER | mcrxr | crfD | The contents of XER[0-3] are copied into the condition register field designated by crfD. All other CR fields remain unchanged. The contents of XER[0-3] are cleared. |
| Move from Condition Register | mfcr | rD | The contents of the CR are placed into rD. |
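
The field-mask behavior of mtcrf and the copy-and-clear behavior of mcrxr can be sketched as follows. The Python fragment is a model only; the function names and the choice to return new register values are assumptions made for the example, and bit 0 of CRM, the CR, and the XER is treated as the most significant bit.

```python
def mtcrf(cr, crm, rs):
    """Return the new CR after mtcrf with 8-bit field mask crm and source rs."""
    mask = 0
    for i in range(8):
        if (crm >> (7 - i)) & 1:             # CRM(i) = 1 selects CR field i
            mask |= 0xF << (28 - 4 * i)      # CR bits 4*i through 4*i + 3
    return (rs & mask) | (cr & ~mask & 0xFFFFFFFF)

def mcrxr(cr, xer, crfd):
    """Copy XER[0-3] into CR field crfD and clear XER[0-3]; returns (cr, xer)."""
    field = (xer >> 28) & 0xF
    shift = 28 - 4 * crfd
    cr = (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)
    return cr, xer & 0x0FFFFFFF
```

With CRM = 0b10000001, only CR fields 0 and 7 take their values from rS; the six fields in between keep their previous contents.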


4.2.5.2 Move to/from Special-Purpose Register Instructions (UISA)

Table 4-26 provides a brief description of the mtspr and mfspr instructions. For more 
detailed information refer to Chapter 8, “Instruction Set.”


Table 4-26. Move to/from Special-Purpose Register Instructions (UISA)

| Name | Mnemonic | Operand Syntax | Operation |
|---|---|---|---|
| Move to Special-Purpose Register | mtspr | SPR,rS | The value specified by rS is placed in the specified SPR. |
| Move from Special-Purpose Register | mfspr | rD,SPR | The contents of the specified SPR are placed in rD. |

4.2.6 Memory Synchronization Instructions — UISA

Memory synchronization instructions control the order in which memory operations are 
completed with respect to asynchronous events, and the order in which memory operations 
are seen by other processors or memory access mechanisms. 

The number of cycles required to complete a sync instruction depends on system 
parameters and on the processor's state when the instruction is issued. As a result, frequent 
use of this instruction may degrade performance slightly. The eieio instruction may be more 
appropriate than sync for many cases. 

The PowerPC architecture defines the sync instruction with CR update enabled (Rc field,
bit 31 = 1) to be an invalid form. 


The proper paired use of the lwarx with stwcx. instructions allows programmers to emulate 
common semaphore operations such as test and set, compare and swap, exchange memory, 
and fetch and add. Examples of these semaphore operations can be found in Appendix E, 
“Synchronization Programming Examples.” The lwarx instruction must be paired with an 
stwcx. instruction, with the same effective address specified by both instructions of the pair. 
The only exception is that an unpaired stwcx. instruction to any (scratch) effective address 
can be used to clear any reservation held by the processor. 




NOTE: The reservation granularity is implementation-dependent. 

The concept behind the use of the lwarx and stwcx. instructions is that a processor may
load a semaphore from memory, compute a result based on the value of the semaphore, and 
conditionally store it back to the same location. The conditional store is performed based 
upon the existence of a reservation established by the preceding lwarx instruction. If the 
reservation exists when the store is executed, the store is performed and a bit is set in the 
CR. If the reservation does not exist when the store is executed, the target memory location 
is not modified and a bit is cleared in the CR. 


The lwarx and stwcx. primitives allow software to read a semaphore, compute a result
based on the value of the semaphore, store the new value back into the semaphore location 
only if that location has not been modified since it was first read, and determine if the store 
was successful. If the store was successful, the sequence of instructions from the read of the 
semaphore to the store that updated the semaphore appears to have been executed atomically
(that is, no other processor or mechanism modified the semaphore location between the 
read and the update), thus providing the equivalent of a real atomic operation. However, in 
reality, other processors may have read from the location during this operation. 

The lwarx and stwcx. instructions require the EA to be aligned. 

In general, the lwarx and stwcx. instructions should be used only in system programs, 
which can be invoked by application programs as needed. 

At most one reservation exists simultaneously on any processor. The address associated
with the reservation can be changed by a subsequent lwarx instruction. The conditional
store is performed based upon the existence of a reservation established by the preceding
lwarx instruction.


A reservation held by the processor is cleared (or may be cleared, in the case of the fourth 
and fifth bullet items) by one of the following: 

• The processor holding the reservation executes another lwarx instruction; this clears 
the first reservation and establishes a new one. 

• The processor holding the reservation executes a stwcx. instruction, whether or not its
address matches that of the lwarx.

• Some other processor executes a store or dcbz to the same reservation granule, or 
modifies a referenced or changed bit in the same reservation granule. 

• Some other processor executes a dcbtst, dcbst, dcbf, or dcbi to the same reservation 
granule; whether the reservation is cleared is undefined. 

• Some other processor executes a dcba to the same reservation granule. The 
reservation is cleared if the instruction causes the target block to be newly 
established in the data cache or to be modified; otherwise, whether the reservation 
is cleared is undefined. 

• Some other mechanism modifies a memory location in the same reservation granule. 

NOTE: Exceptions do not clear reservations; however, system software invoked by 
exceptions may clear reservations. 
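
The reservation rules above can be illustrated with a small software model. The following 
Python sketch is not PowerPC code and is not taken from the architecture specification; the 
class ReservationModel and the helper atomic_add are hypothetical names, the reservation 
granule is assumed to be a single word, and the CR bit is modeled as a Boolean return 
value. It shows only the retry pattern that lwarx and stwcx. make possible. 

```python
class ReservationModel:
    """Toy model: one shared word-addressed memory, one reservation per processor."""

    def __init__(self):
        self.memory = {}        # word address -> value
        self.reservation = {}   # processor id -> reserved word address

    def lwarx(self, cpu, addr):
        # A new lwarx replaces any reservation this processor already holds.
        self.reservation[cpu] = addr
        return self.memory.get(addr, 0)

    def store(self, cpu, addr, value):
        # A store by one processor clears other processors' reservations
        # on the same granule (assumed here to be the same word).
        self.memory[addr] = value
        stale = [p for p, a in self.reservation.items() if p != cpu and a == addr]
        for other in stale:
            del self.reservation[other]

    def stwcx(self, cpu, addr, value):
        # stwcx. always clears the executing processor's reservation.
        reserved = self.reservation.pop(cpu, None)
        if reserved != addr:
            # No reservation, or a mismatched address. The architecture leaves the
            # mismatched case undefined; this model chooses not to store.
            return False
        self.store(cpu, addr, value)
        return True   # models the CR bit that reports a successful store


def atomic_add(model, cpu, addr, delta, interfere=None):
    """Retry lwarx/stwcx. until the conditional store succeeds."""
    attempts = 0
    while True:
        attempts += 1
        old = model.lwarx(cpu, addr)
        if interfere is not None and attempts == 1:
            interfere()   # simulate another processor acting between load and store
        if model.stwcx(cpu, addr, old + delta):
            return old, attempts
```

In this model, an interfering store from a second processor between the lwarx and the 
stwcx. causes the first attempt to fail, and the loop repeats with the newly stored value. 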



Table 4-27 summarizes the memory synchronization instructions as defined in the UISA. 
See Section 4.3.2, “Memory Synchronization Instructions — VEA,” for details about 
additional memory synchronization (eieio and isync) instructions. 


Table 4-27. Memory Synchronization Instructions — UISA 

Name: Load Word and Reserve Indexed 
Mnemonic: lwarx 
Operand Syntax: rD,rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). The word in memory addressed by the EA is 
loaded into rD. 

Name: Store Word Conditional Indexed 
Mnemonic: stwcx. 
Operand Syntax: rS,rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

If a reservation exists and the effective address specified by the stwcx. 
instruction is the same as that specified by the load and reserve instruction 
that established the reservation, the contents of rS are stored into the word in 
memory addressed by the EA, and the reservation is cleared. 

If a reservation exists but the effective address specified by the stwcx. 
instruction is not the same as that specified by the load and reserve 
instruction that established the reservation, the reservation is cleared, and it is 
undefined whether the contents of rS are stored into the word in memory 
addressed by the EA. 

If a reservation does not exist, the instruction completes without altering 
memory or the contents of the cache. 

Name: Synchronize 
Mnemonic: sync 
Operand Syntax: None 
Operation: 
Executing a sync instruction ensures that all instructions preceding the sync 
instruction appear to have completed before the sync instruction completes, 
and that no subsequent instructions are initiated by the processor until after 
the sync instruction completes. When the sync instruction completes, all 
memory accesses caused by instructions preceding the sync instruction will 
have been performed with respect to all other mechanisms that access 
memory. 

See Chapter 8, “Instruction Set,” for more information. 


4.2.7 Recommended Simplified Mnemonics 

To simplify assembly language programs, a set of simplified mnemonics is provided for 
some of the most frequently used operations (such as no-op, load immediate, load address, 
move register, and complement register). Assemblers should provide the simplified 
mnemonics listed in Section F.9, “Recommended Simplified Mnemonics.” Programs 
written to be portable across the various assemblers for the PowerPC architecture should 
not assume the existence of mnemonics not described in this document. 

For a complete list of simplified mnemonics, see Appendix F, “Simplified Mnemonics.” 


4.3 PowerPC VEA Instructions 



The PowerPC virtual environment architecture (VEA) describes the semantics of the 
memory model that can be assumed by software processes, and includes descriptions of the 
cache model, cache-control instructions, address aliasing, and other related issues. 
Implementations that conform to the VEA also adhere to the UISA, but may not necessarily 
adhere to the OEA. 


This section describes additional instructions that are provided by the VEA. 


4.3.1 Processor Control Instructions — VEA 

The VEA defines the mftb instruction (user-level instruction) for reading the contents of 
the time base register; see Chapter 5, “Cache Model and Memory Coherency,” for more 
information. Table 4-28 describes the mftb instruction. 

Simplified mnemonics are provided (see Section F.8, “Simplified Mnemonics for 
Special-Purpose Registers”) for the mftb instruction so it can be coded with the TBR name 
as part of the mnemonic rather than requiring it to be coded as an operand. The simplified 
mnemonics Move from Time Base (mftb) and Move from Time Base Upper (mftbu) are 
variants of the mftb instruction rather than of the mfspr instruction. The mftb instruction 
serves as both a basic and simplified mnemonic. Assemblers recognize an mftb mnemonic 
with two operands as the basic form, and an mftb mnemonic with one operand as the 
simplified form. 

It is not possible to read the entire 64-bit time base register in a single instruction. The mftb 
simplified mnemonic moves from the lower half of the time base register (TBL) to a GPR, 
and the mftbu simplified mnemonic moves from the upper half of the time base (TBU) to 
a GPR. 


Table 4-28. Move from Time Base Instruction 

Name: Move from Time Base 
Mnemonic: mftb 
Operand Syntax: rD,TBR 
Operation: 
The TBR field denotes either time base lower or time base upper, encoded 
as shown in Table 4-29 and Table 4-30. The contents of the designated 
register are copied to rD. 


Table 4-29 summarizes the time base (TBL/TBU) register encodings to which user-level 
access (using mftb) is permitted (as specified by the VEA). 


Table 4-29. User-Level TBR Encodings (VEA) 

Decimal Value in TBR Field   tbr[0-4]   tbr[5-9]   Register Name   Description 
268                          01100      01000      TBL             Time base lower (read-only) 
269                          01101      01000      TBU             Time base upper (read-only) 


Table 4-30 summarizes the TBL and TBU register encodings to which supervisor-level 
access (using mtspr) is permitted. 


Table 4-30. Supervisor-Level TBR Encodings (VEA) 

Decimal Value in SPR Field   spr[0-4]   spr[5-9]   Register Name   Description 
284                          11100      01000      TBL (1)         Time base lower (write only) 
285                          11101      01000      TBU (1)         Time base upper (write only) 

(1) Moving from the time base (TBL and TBU) can also be accomplished with the mftb instruction. 


4.3.2 Memory Synchronization Instructions — VEA 

Memory synchronization instructions control the order in which memory operations are 
completed with respect to asynchronous events, and the order in which memory operations 
are seen by other processors or memory access mechanisms. See Chapter 5, “Cache Model 
and Memory Coherency,” for additional information about these instructions and about 
related aspects of memory synchronization. 

System designs that use a second-level cache should take special care to recognize the 
hardware signaling caused by a sync operation and perform the appropriate actions to 
guarantee that memory references that may be queued internally to the second-level cache 
have been performed globally. 

In addition to the sync instruction (specified by UISA), the VEA defines the Enforce 
In-Order Execution of I/O (eieio) and Instruction Synchronize (isync) instructions; see 
Table 4-31. The number of cycles required to complete an eieio instruction depends on 
system parameters and on the processor's state when the instruction is issued. As a result, 
frequent use of this instruction may degrade performance slightly. 

The isync instruction causes the processor to wait for any preceding instructions to 
complete, discard all prefetched instructions, and then branch to the next sequential 
instruction after isync (which has the effect of clearing the pipeline of prefetched 
instructions). 


Table 4-31. Memory Synchronization Instructions — VEA 

Name: Enforce In-Order Execution of I/O 
Mnemonic: eieio 
Operation: 
The eieio instruction provides an ordering function for the effects of loads 
and stores executed by a processor. 

Name: Instruction Synchronize 
Mnemonic: isync 
Operation: 
Executing an isync instruction ensures that all previous instructions 
complete before the isync instruction completes, although memory 
accesses caused by those instructions need not have been performed 
with respect to other processors and mechanisms. It also ensures that the 
processor initiates no subsequent instructions until the isync instruction 
completes. Finally, it causes the processor to discard any prefetched 
instructions, so subsequent instructions will be fetched and executed in 
the context established by the instructions preceding the isync 
instruction. 

This instruction does not affect other processors or their caches. 



4.3.3 Memory Control Instructions — VEA 

Memory control instructions include the following types: 

• Cache management instructions (user-level and supervisor-level) 

• Segment register manipulation instructions 

• Translation lookaside buffer management instructions 

This section describes the user-level cache management instructions defined by the VEA. 
See Section 4.4.3, “Memory Control Instructions — OEA,” for more information about 
supervisor-level cache, segment register manipulation, and translation lookaside buffer 
management instructions. 

4.3.3.1 User-Level Cache Instructions — VEA 

The instructions summarized in this section provide user-level programs the ability to 
manage on-chip caches if they are implemented. See Chapter 5, “Cache Model and 
Memory Coherency,” for more information about cache topics. 

As with other memory-related instructions, the effects of the cache management instructions 
on memory are weakly ordered. If the programmer needs to ensure that cache or other 
instructions have been performed with respect to all other processors and system 
mechanisms, a sync instruction must be placed in the program following those instructions. 

NOTE: When data address translation is disabled (MSR[DR] = 0), the Data Cache Block 
Clear to Zero (dcbz) and the Data Cache Block Allocate (dcba) instructions 
allocate a cache block in the cache and may not verify that the physical address 
(referred to as real address in the architecture specification) is valid. If a cache 
block is created for an invalid physical address, a machine check condition may 
result when an attempt is made to write that cache block back to memory. The 
cache block could be written back either because an instruction causes a cache 
miss and the invalidly addressed cache block is selected for replacement, or 
because a Data Cache Block Store (dcbst) instruction is executed. 

Any cache control instruction that generates an effective address that corresponds to a 
direct-store segment (segment descriptor[T] = 1) is treated as a no-op. 

NOTE: The direct-store facility is being phased out of the architecture and will not likely 
be supported for future processors. 

Table 4-32 summarizes the cache instructions defined by the VEA. 

NOTE: These instructions are accessible to user-level programs. 


Table 4-32. User-Level Cache Instructions 

Name: Data Cache Block Touch 
Mnemonic: dcbt 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

This instruction is a hint that performance will probably be improved if the block 
containing the byte addressed by EA is fetched into the data cache, because 
the program will probably soon load from the addressed byte. 

Name: Data Cache Block Touch for Store 
Mnemonic: dcbtst 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

This instruction is a hint that performance will probably be improved if the block 
containing the byte addressed by EA is fetched into the data cache, because 
the program will probably soon store into the addressed byte. 

Name: Data Cache Block Allocate 
Mnemonic: dcba 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

If the cache block containing the byte addressed by the EA is in the data cache, 
all bytes of the cache block are made undefined, but the cache block is still 
considered valid. 

Note: Programming errors can occur if the data in this cache block is 
subsequently read or used inadvertently. 

If the cache block containing the byte addressed by the EA is not in the data cache 
and the corresponding page is marked caching allowed (I = 0), the cache block is 
allocated (and made valid) in the data cache without fetching the block from 
main memory, and the value of all bytes of the cache block is undefined. 

If the page containing the byte addressed by the EA is marked caching inhibited 
(WIM = x1x), this instruction is treated as a no-op. 

If the cache block addressed by the EA is located in a page marked as memory 
coherent (WIM = xx1) and the cache block exists in the caches of other 
processors, memory coherence is maintained in those caches. 

The dcba instruction is treated as a store to the addressed byte with respect to 
address translation, memory protection, referenced and changed recording, 
and the ordering enforced by eieio or by the combination of caching-inhibited 
and guarded attributes for a page. 

This instruction is optional in the PowerPC architecture. 

(In the PowerPC OEA, the dcba instruction is additionally defined to clear all 
bytes of a newly established block to zero in the case that the block did not 
already exist in the cache.) 

Name: Data Cache Block Clear to Zero 
Mnemonic: dcbz 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

If the cache block containing the byte addressed by the EA is in the data cache, 
all bytes of the cache block are cleared to zero. 

If the cache block containing the byte addressed by the EA is not in the data cache 
and the corresponding page is marked caching allowed (I = 0), the cache block is 
established in the data cache without fetching the block from main memory, and 
all bytes of the cache block are cleared to zero. 

If the page containing the byte addressed by the EA is marked caching inhibited 
(WIM = x1x) or write-through (WIM = 1xx), either all bytes of the area of main 
memory that corresponds to the addressed cache block are cleared to zero, or 
an alignment exception occurs. 

If the cache block addressed by the EA is located in a page marked as memory 
coherent (WIM = xx1) and the cache block exists in the caches of other 
processors, memory coherence is maintained in those caches. 

The dcbz instruction is treated as a store to the addressed byte with respect to 
address translation, memory protection, referenced and changed recording, 
and the ordering enforced by eieio or by the combination of caching-inhibited 
and guarded attributes for a page. 

Name: Data Cache Block Store 
Mnemonic: dcbst 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

If the cache block containing the byte addressed by the EA is located in a page 
marked memory coherent (WIM = xx1), and a cache block containing the byte 
addressed by EA is in the data cache of any processor and has been modified, 
the cache block is written to main memory. (Note: The architecture does not 
stipulate that the modified status of the block be cleared; that decision is left to 
the processor designer. Either action is logically correct.) 

If the cache block containing the byte addressed by the EA is located in a page 
not marked memory coherent (WIM = xx0), and a cache block containing the 
byte addressed by EA is in the data cache of this processor and has been 
modified, the cache block is written to main memory. (See note above.) 

The function of this instruction is independent of the write-through/write-back 
and caching-inhibited/caching-allowed modes of the cache block containing the 
byte addressed by the EA. 

The dcbst instruction is treated as a load from the addressed byte with respect 
to address translation and memory protection. It may also be treated as a load 
for referenced and changed bit recording except that referenced and changed 
bit recording may not occur. 

Name: Data Cache Block Flush 
Mnemonic: dcbf 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

The action taken depends on the memory mode associated with the target, and 
on the state of the block. The following list describes the action taken for the 
various cases, regardless of whether the page or block containing the 
addressed byte is designated as write-through or if it is in the caching-inhibited 
or caching-allowed mode. 

• Coherency required (WIM = xx1) 

— Unmodified block — Invalidates copies of the block in the caches of all 
processors. 

— Modified block — Copies the block to memory. Invalidates the copy of the 
block in the cache where it is found. There should only be one modified 
block. 

— Absent block — If a modified copy of the block is in the cache of another 
processor, causes it to be copied to memory and invalidated. If 
unmodified copies are in the caches of other processors, causes those 
copies to be invalidated. 

• Coherency not required (WIM = xx0) 

— Unmodified block — Invalidates the block in the processor’s cache. 

— Modified block — Copies the block to memory. Invalidates the block in the 
processor’s cache. 

— Absent block — Does nothing. 

The function of this instruction is independent of the write-through/write-back 
and caching-inhibited/caching-allowed modes of the cache block containing the 
byte addressed by the EA. 

The dcbf instruction is treated as a load from the addressed byte with respect 
to address translation and memory protection. It may also be treated as a load 
for referenced and changed bit recording except that referenced and changed 
bit recording may not occur. 

Name: Instruction Cache Block Invalidate 
Mnemonic: icbi 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

If the cache block containing the byte addressed by EA is located in a page 
marked memory coherent (WIM = xx1), and a cache block containing the byte 
addressed by EA is in the instruction cache of any processor, the cache block is 
made invalid in all such instruction caches, so that the next reference causes 
the cache block to be refetched. 

If the cache block containing the byte addressed by EA is located in a page not 
marked memory coherent (WIM = xx0), and a cache block containing the byte 
addressed by EA is in the instruction cache of this processor, the cache block is 
made invalid in that instruction cache, so that the next reference causes the 
cache block to be refetched. 

The function of this instruction is independent of the write-through/write-back 
and caching-inhibited/caching-allowed modes of the cache block containing the 
byte addressed by the EA. 

The icbi instruction is treated as a load from the addressed byte with respect to 
address translation and memory protection. It may also be treated as a load for 
referenced and changed bit recording except that referenced and changed bit 
recording may not occur. 
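
The data cache operations in Table 4-32 can be pictured with a small software model. The 
Python sketch below is illustrative only: it assumes a single processor, a write-back cache 
on pages where coherency is not required (WIM = xx0), and a 32-byte block, which is an 
assumed size since the real block size is implementation-specific. The class ToyDataCache 
and its method names are hypothetical. 

```python
BLOCK_SIZE = 32   # assumed for illustration; the real size is implementation-specific


def block_base(addr):
    """Address of the first byte of the cache block containing addr."""
    return addr - (addr % BLOCK_SIZE)


class ToyDataCache:
    def __init__(self, memory):
        self.memory = memory   # block address -> bytes (models main memory)
        self.blocks = {}       # block address -> [bytearray data, modified flag]

    def store_byte(self, addr, value):
        base = block_base(addr)
        if base not in self.blocks:
            # Write-back cache: fetch the block on a store miss.
            data = bytearray(self.memory.get(base, bytes(BLOCK_SIZE)))
            self.blocks[base] = [data, False]
        entry = self.blocks[base]
        entry[0][addr - base] = value
        entry[1] = True

    def dcbz(self, addr):
        # Establish the block without fetching it and clear every byte to zero.
        self.blocks[block_base(addr)] = [bytearray(BLOCK_SIZE), True]

    def dcbst(self, addr):
        # Write a modified block to memory; the block stays in the cache.
        entry = self.blocks.get(block_base(addr))
        if entry is not None and entry[1]:
            self.memory[block_base(addr)] = bytes(entry[0])
            entry[1] = False   # clearing the modified status is a design choice

    def dcbf(self, addr):
        # Write a modified block to memory, then invalidate it in either case.
        entry = self.blocks.pop(block_base(addr), None)
        if entry is not None and entry[1]:
            self.memory[block_base(addr)] = bytes(entry[0])
```

In this model a store changes only the cached copy until dcbst or dcbf writes it back, and 
only dcbf removes the block from the cache. 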













4.3.4 External Control Instructions 

The external control instructions allow a user-level program to communicate with a special- 
purpose device. Two instructions are provided and are summarized in Table 4-33. 

Table 4-33. External Control Instructions 

Name: External Control In Word Indexed 
Mnemonic: eciwx 
Operand Syntax: rD,rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

A load word request for the physical address corresponding to the EA is sent to 
the device identified by the EAR[RID] (bits 26-31), bypassing the cache. The 
word returned by the device is placed into rD. The EA sent to the device must be 
word-aligned. 

This instruction is treated as a load from the addressed byte with respect to 
address translation, memory protection, referenced and changed recording, and 
the ordering performed by eieio. 

This instruction is optional. 

Name: External Control Out Word Indexed 
Mnemonic: ecowx 
Operand Syntax: rS,rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

A store word request for the physical address corresponding to the EA and the 
contents of rS are sent to the device identified by EAR[RID] (bits 26-31), 
bypassing the cache. The EA sent to the device must be word-aligned. 

This instruction is treated as a store to the addressed byte with respect to 
address translation, memory protection, referenced and changed recording, and 
the ordering performed by eieio. Software synchronization is required in order to 
ensure that the data access is performed in program order with respect to data 
accesses caused by other store or ecowx instructions, even though the 
addressed byte is assumed to be caching-inhibited and guarded. 

This instruction is optional. 

















4.4 PowerPC OEA Instructions 

The PowerPC operating environment architecture (OEA) includes the structure of the 
memory management model, supervisor-level registers, and the exception model. 
Implementations that conform to the OEA also adhere to the UISA and the VEA. This 
section describes the instructions provided by the OEA. 



4.4.1 System Linkage Instructions — OEA 

This section describes the system linkage instructions (see Table 4-34). The sc instruction 
is a user-level instruction that permits a user program to call on the system to perform a 
service and causes the processor to take an exception. The rfi instruction is a 
supervisor-level instruction that is used to return from an exception handler. 


Table 4-34. System Linkage Instructions — OEA 

Name: System Call 
Mnemonic: sc 
Operand Syntax: None 
Operation: 
When executed, the effective address of the instruction following the sc 
instruction is placed into SRR0. Bits 1-4 and 10-15 of SRR1 are 
cleared. Additionally, bits 16-23, 25-27, and 30-31 of the MSR are 
placed into the corresponding bits of SRR1. Depending on the 
implementation, additional bits of MSR may also be saved in SRR1. 

Then a system call exception is generated. The exception causes the 
MSR to be altered as described in Section 6.4, “Exception Definitions.” 

The exception causes the next instruction to be fetched from offset 
0xC00 from the base physical address indicated by the old setting of 
MSR[IP]. 

This instruction is context synchronizing. 

Name: Return from Interrupt 
Mnemonic: rfi 
Operand Syntax: None 
Operation: 
Bits 16-23, 25-27, and 30-31 of SRR1 are placed into the 
corresponding bits of the MSR. Depending on the implementation, 
additional bits of MSR may also be restored from SRR1. If the new MSR 
value does not enable any pending exceptions, the next instruction is 
fetched, under control of the new MSR value, from the address 
SRR0[0-29] || 0b00. 

If the new MSR value enables one or more pending exceptions, the 
exception associated with the highest priority pending exception is 
generated. At this time SRR0 and SRR1 are left with their current values; 
the MSR is loaded with new values as determined by the exception and 
the processor branches to the exception handler to resolve the pending 
exception. 

This is a supervisor-level instruction and is context-synchronizing. 
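
The bit ranges named for sc and rfi use the architecture's numbering, in which bit 0 is the 
most significant bit of a 32-bit register. The Python sketch below turns those ranges into 
masks and models the save and restore steps. It is a simplified illustration: the changes 
the exception itself makes to the MSR and the pending-exception case of rfi are not 
modeled, the function names are hypothetical, and the vector base values (0x00000000 or 
0xFFF00000, selected by MSR[IP], bit 25) are taken from the exception model rather than 
from this section. 

```python
def bit_mask(*ranges):
    """Build a 32-bit mask from (first, last) bit ranges, where bit 0 is the MSB."""
    mask = 0
    for first, last in ranges:
        for bit in range(first, last + 1):
            mask |= 1 << (31 - bit)
    return mask


SAVED_MSR_BITS = bit_mask((16, 23), (25, 27), (30, 31))   # copied MSR <-> SRR1
CLEARED_SRR1_BITS = bit_mask((1, 4), (10, 15))            # cleared in SRR1 by sc
MSR_IP = bit_mask((25, 25))                               # exception prefix bit


def system_call(state, sc_address):
    """Model the register effects of sc; returns the handler fetch address."""
    state["srr0"] = (sc_address + 4) & 0xFFFFFFFF   # instruction following sc
    kept = state["srr1"] & ~(CLEARED_SRR1_BITS | SAVED_MSR_BITS) & 0xFFFFFFFF
    state["srr1"] = kept | (state["msr"] & SAVED_MSR_BITS)
    base = 0xFFF00000 if state["msr"] & MSR_IP else 0x00000000
    return base + 0xC00   # MSR changes made by the exception are not modeled


def return_from_interrupt(state):
    """Model rfi with no pending exceptions; returns the next fetch address."""
    kept = state["msr"] & ~SAVED_MSR_BITS & 0xFFFFFFFF
    state["msr"] = kept | (state["srr1"] & SAVED_MSR_BITS)
    return state["srr0"] & 0xFFFFFFFC   # SRR0[0-29] || 0b00
```

With these definitions the saved-bit mask evaluates to 0x0000FF73 and the cleared-bit 
mask to 0x783F0000, which is a convenient cross-check against the bit ranges in the table. 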


4.4.2 Processor Control Instructions — OEA 

This section describes the processor control instructions that are used to read from and 
write to the MSR and the SPRs. 

4.4.2.1 Move to/from Machine State Register Instructions 

Table 4-35 summarizes the instructions used for reading from and writing to the MSR. 


Table 4-35. Move to/from Machine State Register Instructions 

Name: Move to Machine State Register 
Mnemonic: mtmsr 
Operand Syntax: rS 
Operation: 
The contents of rS are placed into the MSR. 

This instruction is a supervisor-level instruction and is context 
synchronizing except with respect to alterations to the POW and LE 
bits. Refer to Section 2.3.17, “Synchronization Requirements for 
Special Registers and for Lookaside Buffers,” for more information. 

Name: Move from Machine State Register 
Mnemonic: mfmsr 
Operand Syntax: rD 
Operation: 
The contents of the MSR are placed into rD. This is a supervisor-level 
instruction. 


4.4.2.2 Move to/from Special-Purpose Register Instructions (OEA) 

This section briefly describes the mtspr and mfspr instructions (see Table 4-36). For 
more detailed information, see Chapter 8, “Instruction Set.” Simplified mnemonics are 
provided for the mtspr and mfspr instructions in Appendix F, “Simplified Mnemonics.” 
For a discussion of context synchronization requirements when altering certain SPRs, refer 
to Appendix E, “Synchronization Programming Examples.” 


Table 4-36. Move to/from Special-Purpose Register Instructions (OEA) 

Name: Move to Special-Purpose Register 
Mnemonic: mtspr 
Operand Syntax: SPR,rS 
Operation: 
The SPR field denotes a special-purpose register. The contents of rS 
are placed into the designated SPR. 

For this instruction, SPRs TBL and TBU are treated as separate 32-bit 
registers; setting one leaves the other unaltered. 

Name: Move from Special-Purpose Register 
Mnemonic: mfspr 
Operand Syntax: rD,SPR 
Operation: 
The SPR field denotes a special-purpose register. The contents of 
the designated SPR are placed into rD. 


For mtspr and mfspr instructions, the SPR number coded in assembly language does not 
appear directly as a 10-bit binary number in the instruction. The number coded is split into 
two 5-bit halves that are reversed in the instruction encoding, with the high-order 5 bits 
appearing in bits 16-20 of the instruction encoding and the low-order 5 bits in bits 11-15. 
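
The swapped-halves encoding is easy to get wrong by hand, so a short worked example may 
help. The Python sketch below computes the 10-bit field as it appears in the instruction 
and assembles a complete instruction word. The function names are hypothetical, and the 
opcode constants (primary opcode 31, extended opcode 339 for mfspr and 371 for mftb) are 
assumptions drawn from the standard instruction encodings rather than from this section; 
see Chapter 8 for the authoritative formats. 

```python
def spr_field(spr_number):
    """Return the 10-bit SPR/TBR field as encoded in instruction bits 11-20."""
    if not 0 <= spr_number < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    high = spr_number >> 5      # high-order 5 bits, placed in bits 16-20
    low = spr_number & 0x1F     # low-order 5 bits, placed in bits 11-15
    return (low << 5) | high


def encode_move_from(rd, spr_number, extended_opcode):
    """Assemble a move-from instruction word (bit 0 is the most significant bit)."""
    if not 0 <= rd < 32:
        raise ValueError("rD must be a register number from 0 to 31")
    return (
        (31 << 26)                        # primary opcode, bits 0-5
        | (rd << 21)                      # rD, bits 6-10
        | (spr_field(spr_number) << 11)   # swapped SPR field, bits 11-20
        | (extended_opcode << 1)          # extended opcode, bits 21-30
    )


MFSPR_XO = 339   # assumed extended opcode for mfspr
MFTB_XO = 371    # assumed extended opcode for mftb
```

For example, SPR number 268 (TBL) is 0b01000 01100 as a 10-bit number, and 
spr_field(268) yields 0b01100 01000, matching the tbr[0-4] and tbr[5-9] columns of 
Table 4-29. 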

For information on SPR encodings (both user- and supervisor-level), see Chapter 8, 
“Instruction Set.” 

NOTE: There are additional SPRs specific to each implementation; for implementation- 
specific SPRs, see the user’s manual for your particular processor. 

4.4.3 Memory Control Instructions — OEA 

Memory control instructions include the following types of instructions: 

• Cache management instructions (supervisor-level and user-level) 

• Segment register manipulation instructions 

• Translation lookaside buffer management instructions 

This section describes supervisor-level memory control instructions. See Section 4.3.3, 
“Memory Control Instructions — VEA,” for more information about user-level cache 
management instructions. 

4.4.3.1 Supervisor-Level Cache Management Instruction 

Table 4-37 summarizes the operation of the only supervisor-level cache management 
instruction. See Section 4.3.3.1, “User-Level Cache Instructions — VEA,” for cache 
instructions that provide user-level programs the ability to manage the on-chip caches. 

NOTE: Any cache control instruction that generates an effective address that 
corresponds to a direct-store segment (segment descriptor[T] = 1) is treated as a 
no-op. 

Table 4-37. Cache Management Supervisor-Level Instruction 


Name: Data Cache Block Invalidate 
Mnemonic: dcbi 
Operand Syntax: rA,rB 
Operation: 
The EA is the sum (rA|0) + (rB). 

The action taken depends on the memory mode associated with the target, and 
the state (modified, unmodified) of the cache block. The following list describes 
the action to take if the cache block containing the byte addressed by the EA is or 
is not in the cache. 

• Coherency required (WIM = xx1) 

— Unmodified cache block — Invalidates copies of the cache block in the 
caches of all processors. 

— Modified cache block — Invalidates the copy of the cache block in the 
cache of the processor where the block is found (there can only be one 
modified block). The modified contents are discarded. 

— Absent cache block — If copies are in the caches of any other processor, 
causes the copies to be invalidated. (Discards any modified contents.) 

• Coherency not required (WIM = xx0) 

— Unmodified cache block — Invalidates the cache block in the local cache. 

— Modified cache block — Invalidates the cache block in the local cache. 
(Discards the modified contents.) 

— Absent cache block — No action is taken. 

When data address translation is enabled (MSR[DR] = 1) and the logical (effective) 
address has no translation, a data access exception occurs. 

The function of this instruction is independent of the write-through and 
cache-inhibited/allowed modes determined by the WIM bit settings of the block 
containing the byte addressed by the EA. 

This instruction is treated as a store to the addressed byte with respect to 
address translation and protection, except that the change bit need not be set, 
and if the change bit is not set then the reference bit need not be set. 
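
The practical difference between dcbi and the user-level dcbf instruction (Table 4-32) is 
what happens to modified data. The Python sketch below contrasts the two for the case 
where coherency is not required. It is a toy model with hypothetical function names: the 
cache is a dictionary keyed by block address, and each entry records its data and a 
modified flag. 

```python
def dcbf(cache, memory, block):
    """Flush: copy a modified block to memory, then invalidate it."""
    entry = cache.pop(block, None)
    if entry is not None and entry["modified"]:
        memory[block] = entry["data"]


def dcbi(cache, memory, block):
    """Invalidate: drop the block; any modified contents are discarded."""
    # memory is accepted only to mirror dcbf's signature; it is never written.
    cache.pop(block, None)
```

After dcbi, main memory still holds whatever it held before the block was modified in the 
cache, which is why the instruction is restricted to supervisor-level software. 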


4.4.3.2 Segment Register Manipulation Instructions 

The instructions listed in Table 4-38 provide access to segment registers 0 
through 15. These instructions operate completely independently of the MSR[IR] and 
MSR[DR] bit settings. Refer to Section 2.3.17, “Synchronization Requirements for Special 
Registers and for Lookaside Buffers,” for serialization requirements and other 
recommended precautions to observe when manipulating the segment registers. 


Table 4-38. Segment Register Manipulation Instructions 

Name: Move to Segment Register 
Mnemonic: mtsr 
Operand Syntax: SR,rS 
Operation: 
The contents of rS are placed into the segment register specified by 
operand SR. 

This is a supervisor-level instruction. 

Name: Move to Segment Register Indirect 
Mnemonic: mtsrin 
Operand Syntax: rS,rB 
Operation: 
The contents of rS are copied to the segment register selected by bits 
0-3 of rB. 

This is a supervisor-level instruction. 

Name: Move from Segment Register 
Mnemonic: mfsr 
Operand Syntax: rD,SR 
Operation: 
The contents of the segment register specified by operand SR are 
placed into rD. 

This is a supervisor-level instruction. 

Name: Move from Segment Register Indirect 
Mnemonic: mfsrin 
Operand Syntax: rD,rB 
Operation: 
The contents of the segment register selected by bits 0-3 of rB are 
copied into rD. 

This is a supervisor-level instruction. 
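
For the indirect forms, bits 0-3 of rB are the four most significant bits of the register, so 
an effective address can be passed in rB to select the segment register that covers it. The 
Python sketch below models that selection; the class SegmentRegisters and the helper 
segment_register_index are hypothetical names, and supervisor-level checks are not 
modeled. 

```python
def segment_register_index(rb):
    """Bits 0-3 of rB (bit 0 is the most significant bit) select one of 16 registers."""
    return (rb >> 28) & 0xF


class SegmentRegisters:
    def __init__(self):
        self.sr = [0] * 16

    def mtsr(self, sr, rs):
        self.sr[sr] = rs & 0xFFFFFFFF

    def mtsrin(self, rs, rb):
        self.sr[segment_register_index(rb)] = rs & 0xFFFFFFFF

    def mfsr(self, sr):
        return self.sr[sr]

    def mfsrin(self, rb):
        return self.sr[segment_register_index(rb)]
```

In this model, any value of rB from 0xC0000000 through 0xCFFFFFFF selects segment 
register 12. 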


4.4.3.3 Translation Lookaside Buffer Management Instructions 

The address translation mechanism is defined in terms of segment descriptors and page 
table entries (PTEs) used by PowerPC processors to locate the logical-to-physical address 
mapping for a particular access. These segment descriptors and PTEs reside in segment 
registers and page tables in memory, respectively. 

For performance reasons, many processors implement one or more translation lookaside 
buffers on-chip. These are buffers (caches) that cache a portion of the page frame table. As 
changes are made to the address translation tables, it is necessary to maintain coherency 
between the TLB and the updated tables. This is done by invalidating TLB entries, or 
occasionally by invalidating the entire TLB, and allowing the translation caching 
mechanism to refetch from the tables. 
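
The need for invalidation can be seen in a small model of a TLB acting as a cache in front 
of the page table. The Python sketch below is illustrative only: the class ToyMMU is 
hypothetical, translation is reduced to a dictionary lookup from virtual page number to 
physical page number, and multiprocessor effects are ignored. 

```python
class ToyMMU:
    def __init__(self, page_table):
        self.page_table = page_table   # virtual page number -> physical page number
        self.tlb = {}                  # cached subset of the page table

    def translate(self, vpn):
        if vpn not in self.tlb:
            # TLB miss: refetch the translation from the tables.
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn]

    def invalidate_entry(self, vpn):
        # Models invalidating a single TLB entry.
        self.tlb.pop(vpn, None)

    def invalidate_all(self):
        # Models invalidating the entire TLB.
        self.tlb.clear()
```

If the page table is changed while a translation is cached, translate keeps returning the 
stale mapping until the entry is invalidated, which is the coherency problem the TLB 
management instructions address. 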

Each PowerPC implementation that has a TLB provides means for invalidating an 
individual TLB entry and/or invalidating the entire TLB. 

Refer to Chapter 7, “Memory Management,” for more information about TLB operation. 
Table 4-39 summarizes the operation of the TLB instructions. 


Table 4-39. Translation Lookaside Buffer Management Instructions 

Name: TLB Invalidate Entry 
Mnemonic: tlbie 
Operand Syntax: rB 
Operation: 
The EA is the contents of rB. If the TLB contains an entry corresponding to the 
EA, that entry is removed from the TLB. The TLB search is performed 
regardless of the settings of MSR[IR] and MSR[DR]. Block address translation 
for the EA, if any, is ignored. 

This instruction causes the target TLB entry to be invalidated in all processors. 

The operation performed by this instruction is treated as a caching inhibited 
and guarded data access with respect to the ordering performed by eieio. 

This is a supervisor-level instruction and optional in the PowerPC architecture. 

Name: TLB Invalidate All 
Mnemonic: tlbia 
Operand Syntax: None 
Operation: 
All TLB entries are made invalid. The TLB is invalidated regardless of the 
settings of MSR[IR] and MSR[DR]. 

This instruction does not cause the entries to be invalidated in other 
processors. 

This is a supervisor-level instruction and optional in the PowerPC architecture. 

Name: TLB Synchronize 
Mnemonic: tlbsync 
Operand Syntax: None 
Operation: 
Executing a tlbsync instruction ensures that all tlbie instructions previously 
executed by the processor executing the tlbsync instruction have completed 
on all processors. 

The operation performed by this instruction is treated as a caching inhibited 
and guarded data access with respect to the ordering performed by eieio. 

This is a supervisor-level instruction and optional in the PowerPC architecture. 


Because the presence and exact semantics of the translation lookaside buffer management
instructions are implementation-dependent, system software should incorporate uses of these
instructions into subroutines to minimize compatibility problems.
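
One way to follow this advice is to route every invalidation through a single routine, so that only that routine changes from one processor to the next. The sketch below models the idea in C with a small software TLB. The `tlb_model` type, the 8-entry size, and the `has_tlbie` capability flag are hypothetical names invented for illustration; they are not part of the architecture. On real hardware the two helper bodies would be the tlbie or tlbia instruction plus whatever synchronization the processor's user's manual requires.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 8u   /* hypothetical size chosen for the model */
#define PAGE_SHIFT  12u  /* 4-Kbyte pages, as the OEA specifies */

struct tlb_entry {
    bool     valid;
    uint32_t epn;        /* effective page number (EA >> PAGE_SHIFT) */
};

struct tlb_model {
    bool             has_tlbie;  /* hypothetical capability flag */
    struct tlb_entry entry[TLB_ENTRIES];
};

/* Models tlbia: every entry becomes invalid. */
static void model_tlbia(struct tlb_model *tlb)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        tlb->entry[i].valid = false;
}

/* Models tlbie: only entries for the page containing ea are removed. */
static void model_tlbie(struct tlb_model *tlb, uint32_t ea)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (tlb->entry[i].valid && tlb->entry[i].epn == (ea >> PAGE_SHIFT))
            tlb->entry[i].valid = false;
}

/* The single subroutine that the rest of the system software calls.
 * If entry-level invalidation is unavailable, fall back to a full flush. */
void tlb_invalidate_page(struct tlb_model *tlb, uint32_t ea)
{
    if (tlb->has_tlbie)
        model_tlbie(tlb, ea);
    else
        model_tlbia(tlb);
}

/* Helper for inspecting the model: number of entries still valid. */
size_t tlb_valid_count(const struct tlb_model *tlb)
{
    size_t n = 0;
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (tlb->entry[i].valid)
            n++;
    return n;
}
```

Callers only ever see `tlb_invalidate_page`; whether the processor invalidates one entry or the whole TLB stays hidden behind it.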

Chapter 5. Cache Model and Memory Coherency

This chapter summarizes the cache model as defined by the virtual environment
architecture (VEA) as well as the built-in architectural controls for maintaining memory
coherency. This chapter describes the cache control instructions and special concerns for
memory coherency in single-processor and multiprocessor systems. Aspects of the
operating environment architecture (OEA) as they relate to the cache model and memory
coherency are also covered.

The PowerPC architecture provides for relaxed memory coherency. Features such as write- 
back caching and out-of-order execution allow software engineers to exploit the 
performance benefits of weakly-ordered memory access. The architecture also provides the 
means to control the order of accesses for order-critical operations. 

In this chapter, the term multiprocessor is used in the context of maintaining cache 
coherency. In this context, a system could include other devices that access system memory, 
maintain independent caches, and function as bus masters. 

Each cache management instruction operates on an aligned unit of memory. The VEA 
defines this cacheable unit as a block. Since the term ‘block’ is easily confused with the unit 
of memory addressed by the block address translation (BAT) mechanism, this chapter uses 
the term ‘cache block’ to indicate the cacheable unit. The size of the cache block can vary 
by instruction and by implementation. In addition, the unit of memory at which coherency 
is maintained is called the coherence block. The size of the coherence block is also 
implementation-specific. However, the coherence block is often the same size as the cache
block. 

5.1 The Virtual Environment 

The user instruction set architecture (UISA) relies upon a memory space of 2^32 bytes for
applications. The VEA expands upon the memory model by introducing virtual memory,
caches, and shared memory multiprocessing. Although many applications will not need to
access the features introduced by the VEA, it is important that programmers are aware that
they are working in a virtual environment where the physical memory may be shared by
multiple processes running on one or more processors.

This section describes load and store ordering, atomicity, the cache model, memory
coherency, and the VEA cache management instructions. The features of the VEA are 
accessible to both user-level and supervisor-level applications (referred to as problem state 
and privileged state, respectively, in the architecture specification). 

The mechanism for controlling the virtual memory space is defined by the OEA. The 
features of the OEA are accessible to supervisor-level applications only (typically operating 
systems). For more information on the address translation mechanism, refer to Chapter 7, 
“Memory Management.” 

5.1.1 Memory Access Ordering 

The VEA specifies a weakly consistent memory model for shared memory multiprocessor 
systems. This model provides an opportunity for significantly improved performance over 
a model that has stronger consistency rules, but places the responsibility for access ordering 
on the programmer. When a program requires strict access ordering for proper execution, 
the programmer must insert the appropriate ordering or synchronization instructions into 
the program. 

The order in which the processor performs memory accesses, the order in which those 
accesses complete in memory, and the order in which those accesses are viewed as 
occurring by another processor may all be different. A means of enforcing memory access 
ordering is provided to allow programs (or instances of programs) to share memory. Similar 
means are needed to allow programs executing on a processor to share memory with some 
other mechanism, such as an I/O device, that can also access memory. 

Various facilities are provided that enable programs to control the order in which memory 
accesses are performed by separate instructions. First, if separate store instructions access 
memory that is designated as both caching-inhibited and guarded, the accesses are 
performed in the order specified by the program. Refer to Section 5.1.4, “Memory 
Coherency,” and Section 5.2.1, “Memory/Cache Access Attributes,” for a complete 
description of the caching-inhibited and guarded attributes. Additionally, two instructions, 
eieio and sync, are provided that enable the program to control the order in which the 
memory accesses caused by separate instructions are performed. 

No ordering should be assumed among the memory accesses caused by a single instruction 
(that is, by an instruction for which multiple accesses are not atomic), and no means are 
provided for controlling that order. Chapter 4, “Addressing Modes and Instruction Set 
Summary,” contains additional information about the sync and eieio instructions. 

5.1.1.1 Enforce In-Order Execution of I/O Instruction

The eieio instruction permits the program to control the order in which loads and stores are 
performed when the accessed memory has certain attributes, as described in Chapter 8, 
“Instruction Set.” For example, eieio can be used to ensure that a sequence of load and store 
operations to an I/O device’s control registers updates those registers in the desired order. 
The eieio instruction can also be used to ensure that all stores to a shared data structure are
visible to other processors before the store that releases the lock is visible to them. 

The eieio instruction may complete before memory accesses caused by instructions 
preceding the eieio instruction have been performed with respect to system memory or 
coherent storage as appropriate. 

If stronger ordering is desired, the sync instruction must be used. 

5.1.1.2 Synchronize Instruction

When a portion of memory that requires coherency must be forced to a known state, it is 
necessary to synchronize memory with respect to other processors and mechanisms. This 
synchronization is accomplished by requiring programs to indicate explicitly in the 
instruction stream, by inserting a sync instruction, that synchronization is required. Only 
when sync completes are the effects of all coherent memory accesses previously executed 
by the program guaranteed to have been performed with respect to all other processors and 
mechanisms that access those locations coherently. 

The sync instruction ensures that all the coherent memory accesses, initiated by a program, 
have been performed with respect to all other processors and mechanisms that access the 
target locations coherently, before its next instruction is executed. A program can use this 
instruction to ensure that all updates to a shared data structure, accessed coherently, are 
visible to all other processors that access the data structure coherently, before executing a 
store that will release a lock on that data structure. Execution of the sync instruction does 
the following: 

• Performs the functions described for the sync instruction in Section 4.2.6, “Memory 
Synchronization Instructions — UISA.” 

• Ensures that consistency operations, and the effects of icbi, dcbz, dcbst, dcbf, dcba, 
and dcbi instructions previously executed by the processor executing sync, have 
completed on such other processors as the memory/cache access attributes of the 
target locations require. 

• Ensures that TLB invalidate operations previously executed by the processor 
executing the sync have completed on that processor. The sync instruction does not 
wait for such invalidates to complete on other processors. 

• Ensures that memory accesses due to instructions previously executed by the 
processor executing the sync are recorded in the R and C bits in the page table and 
that the new values of those bits are visible to all processors and mechanisms; refer 
to Section 7.5.3, “Page History Recording.” 

The sync instruction is execution synchronizing. It is not context synchronizing, and 
therefore need not discard prefetched instructions. 

For memory that does not require coherency, the sync instruction operates as described
above except that its only effect on memory operations is to ensure that all previous 
memory operations have completed, with respect to the processor executing the sync 
instruction, to the level of memory specified by the memory/cache access attributes 
(including the updating of R and C bits). 
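
The lock-release use of sync described above can be sketched in portable C11. Here `atomic_thread_fence(memory_order_seq_cst)` stands in for sync; that a PowerPC compiler emits a sync-class barrier for this fence is an assumption about the toolchain, not something this manual defines. The `shared_counter` type and the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

struct shared_counter {
    atomic_flag lock;     /* set = held, clear = free */
    uint32_t    value;    /* protected by lock */
    uint32_t    updates;  /* protected by lock */
};

/* Spin until the lock is acquired. */
void counter_lock(struct shared_counter *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;  /* busy-wait */
}

/* Release the lock. The full fence plays the role of sync: every update
 * made while holding the lock is ordered before the releasing store. */
void counter_unlock(struct shared_counter *c)
{
    atomic_thread_fence(memory_order_seq_cst);
    atomic_flag_clear_explicit(&c->lock, memory_order_relaxed);
}

/* Update the shared structure under the lock. */
void counter_add(struct shared_counter *c, uint32_t n)
{
    counter_lock(c);
    c->value += n;
    c->updates++;
    counter_unlock(c);
}
```

The fence is placed before the store that clears the lock, matching the order the text calls for: updates first, then the synchronizing operation, then the release.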

5.1.2 Atomicity 

An access is atomic if it is always performed in its entirety with no visible fragmentation. 
Atomic accesses are thus serialized — each happens in its entirety in some order, even when 
that order is neither specified in the program nor enforced between processors. 

Only the following single-register accesses are guaranteed to be atomic: 

• Byte accesses (all bytes are aligned on byte boundaries) 

• Half-word accesses aligned on half-word boundaries 

• Word accesses aligned on word boundaries 


No other accesses are guaranteed to be atomic. In particular, the accesses caused by the 
following instructions are not guaranteed to be atomic: 

• Load and store instructions with misaligned operands 

• lmw, stmw, lswi, lswx, stswi, or stswx instructions 

• Floating-point double-word accesses 

• Any cache management instructions 
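
The rule for single-register accesses can be written as a small predicate. This is a sketch of the list above only; the function name is hypothetical, and the multiple-access instructions (lmw, stmw, the string instructions, and cache management instructions) are outside what it models.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true only for the single-register accesses that the text
 * lists as guaranteed atomic: bytes, aligned half words, aligned words. */
bool access_guaranteed_atomic(uint32_t ea, size_t size_in_bytes)
{
    switch (size_in_bytes) {
    case 1:  return true;              /* bytes are always aligned */
    case 2:  return (ea & 1u) == 0;    /* half-word boundary */
    case 4:  return (ea & 3u) == 0;    /* word boundary */
    default: return false;             /* for example, double words */
    }
}
```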

The lwarx/stwcx. instruction combination can be used to perform atomic memory 
references. The lwarx instruction is a load from a word-aligned location that has two side 
effects: 

1. A reservation for a subsequent stwcx. instruction is created.

2. The memory coherence mechanism is notified that a reservation exists for the 
memory location accessed by the lwarx. 

The stwcx. instruction is a store to a word-aligned location that is conditioned on the
existence of the reservation created by lwarx, on whether both instructions specify the
same memory location, and on whether both instructions are issued by the same
processor.
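
The load-reserve, modify, store-conditional, retry pattern has a close analog in C11. In the sketch below `atomic_compare_exchange_weak_explicit` is allowed to fail spuriously, much as stwcx. fails when the reservation has been lost, so the update sits in a loop. That a PowerPC compiler implements this loop with lwarx/stwcx. is an assumption about the toolchain, and the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add delta to *word and return the value seen before the add. */
uint32_t atomic_add_word(_Atomic uint32_t *word, uint32_t delta)
{
    /* The initial load plays the role of lwarx. */
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);

    /* The conditional store plays the role of stwcx.; on failure, old is
     * refreshed with the current contents and the update is retried. */
    while (!atomic_compare_exchange_weak_explicit(word, &old, old + delta,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry with the refreshed value */
    }
    return old;
}
```

Relaxed ordering is used because the sketch shows atomicity only; ordering against other accesses would still need the synchronization instructions described in Section 5.1.1.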

NOTE: When a reservation is made to a word in memory by the lwarx instruction, an
address is saved and a reservation is set. Both of these are necessary for the
memory coherence mechanism; however, some processors do not implement the
address compare for the stwcx. instruction. On such processors, only the
reservation need be established in order for the stwcx. to be successful. This
requires that exception handlers clear reservations if control is passed to
another program. Programmers should read the specifications for each individual
processor.

In a multiprocessor system, every processor (other than the one executing lwarx/stwcx.)
that might update the location must configure the addressed page as memory coherency 
required. The lwarx/stwcx. instructions function in caching-inhibited, as well as in 
caching-allowed, memory. If the addressed memory is in write-through mode, it is 
implementation-dependent whether these instructions function correctly or cause the DSI 
exception handler to be invoked. 

NOTE: Exceptions are referred to as interrupts in the architecture specification. 

The lwarx/stwcx. instruction combination is described in Section 4.2.6, “Memory 
Synchronization Instructions — UISA,” and Chapter 8, “Instruction Set.” 

5.1.3 Cache Model 

The PowerPC architecture does not specify the type, organization, implementation, or even 
the existence of a cache. The standard cache model has separate instruction and data caches, 
also known as a Harvard cache model. However, the architecture allows for many different 
cache types. Some implementations will have a unified cache (where there is a single cache 
for both instructions and data). Other implementations may not have a cache at all. 

The function of the cache management instructions depends on the implementation of the 
cache(s) and the setting of the memory/cache access modes. For a program to execute 
properly on all implementations, software should use the Harvard model. In cases where a 
processor is implemented without a cache, the architecture guarantees that instructions 
affecting the nonimplemented cache will not halt execution. 

For example, a processor with no cache may treat a cache instruction as a no-op, and a
processor with a unified cache may treat the icbi instruction as a no-op. In this
manner, programs written for separate instruction and data caches will run on all
compliant implementations.

NOTE: dcbz may cause an alignment exception on some implementations.

5.1.4 Memory Coherency 

The primary objective of a coherent memory system is to provide the same image of 
memory to all devices using the system. The VEA and OEA define coherency controls that 
facilitate synchronization, cooperative use of shared resources, and task migration among 
processors. These controls include the memory/cache access attributes, the sync and eieio 
instructions, and the lwarx/stwcx. instruction pair. Without these controls, the processor 
could not support a weakly-ordered memory access model. 

A strongly-ordered memory access model hinders performance by requiring excessive 
overhead, particularly in multiprocessor environments. For example, a processor 
performing a store operation in a strongly-ordered system requires exclusive access to an 
address before making an update, to prevent another device from using stale data. 

The VEA defines a page as a unit of memory for which protection and control attributes are
independently specifiable. The OEA (supervisor level) specifies the size of a page as 
4 Kbytes. 

NOTE: The VEA (user level) does not specify the page size. 

5.1.4.1 Memory/Cache Access Modes

The OEA defines the set of memory/cache access modes and the mechanism to implement 
these modes. Refer to Section 5.2.1, “Memory/Cache Access Attributes,” for more 
information. However, the VEA specifies that at the user level, the operating system can be 
expected to provide the following attributes for each page of memory: 

• Write-through or write-back

• Caching-inhibited or caching-allowed 

• Memory coherency required or memory coherency not required 

• Guarded or not guarded 

User-level programs specify the memory/cache access attributes through an operating 
system service. 

5.1.4.1.1 Pages Designated as Write-Through

When a page is designated as write-through, store operations update the data in the cache 
and also update the data in main memory. The processor writes to the cache and through to 
main memory. Load operations use the data in the cache, if it is present. 

In write-back mode, the processor is only required to update data in the cache. The 
processor may (but is not required to) update main memory. Load and store operations use 
the data in the cache, if it is present. The data in main memory does not necessarily stay 
consistent with that same location’s data in the cache. Many implementations automatically 
update main memory in response to a memory access by another device (for example, a 
snoop hit). In addition, the dcbst and dcbf instructions can be used to explicitly force an 
update of main memory. 
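
The difference between the two store policies can be illustrated with a one-word model of a cached location. This is a sketch under simplifying assumptions: the word is taken to be present in the cache, the `cached_word` type and function names are hypothetical, and the model clears the modified flag after dcbst even though the architecture leaves that choice to the implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct cached_word {
    uint32_t cache;     /* copy held in the data cache */
    uint32_t memory;    /* copy held in main memory */
    bool     modified;  /* cache holds data not yet written to memory */
};

/* Store to a location assumed to be present in the cache. */
void model_store(struct cached_word *w, uint32_t value, bool write_through)
{
    w->cache = value;
    if (write_through)
        w->memory = value;          /* written through to main memory */
    w->modified = !write_through;   /* write-back leaves memory stale */
}

/* Models dcbst: force modified data out to main memory. */
void model_dcbst(struct cached_word *w)
{
    if (w->modified) {
        w->memory = w->cache;
        w->modified = false;        /* an implementation choice in this model */
    }
}
```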

The write-through attribute is meaningless for locations designated as caching-inhibited. 

5.1.4.1.2 Pages Designated as Caching-Inhibited

When a page is designated as caching-inhibited, the processor bypasses the cache and 
performs load and store operations to main memory. When a page is designated as caching- 
allowed, the processor uses the cache and performs load and store operations to the cache 
or main memory depending on the other memory/cache access attributes for the page. 

It is important that all locations in a page are purged from the cache prior to changing the 
memory/cache access attribute for the page from caching-allowed to caching-inhibited. It 
is considered a programming error if a caching-inhibited memory location is found in the 
cache. Software must ensure that the location has not previously been brought into the 
cache, or, if it has, that it has been flushed from the cache. If the programming error occurs, 
the result of the access is boundedly undefined. 

5.1.4.1.3 Pages Designated as Memory Coherency Required

When a page is designated as memory coherency required, store operations to that location 
are serialized with all stores to that same location by all other processors that also access 
the location coherently. This can be implemented, for example, by an ownership protocol 
that allows at most one processor at a time to store to the location. Moreover, the current 
copy of a cache block that is in this mode may be copied to main storage any number of 
times, for example, by successive dcbst instructions. 

Coherency does not ensure that the result of a store by one processor is visible immediately 
to all other processors and mechanisms. Only after a program has executed the sync 
instruction are the previous storage accesses it executed guaranteed to have been performed 
with respect to all other processors and mechanisms. 

5.1.4.1.4 Pages Designated as Memory Coherency Not Required

For a memory area that is configured such that coherency is not required, software must 
ensure that the data cache is consistent with main storage before changing the mode or 
allowing another device to access the area. 

Executing a dcbst or dcbf instruction specifying a cache block that is in this mode causes 
the block to be copied to main memory if and only if the processor modified the contents 
of a location in the block and the modified contents have not been written to main memory. 

In a single-cache system, correct execution is unlikely to require memory coherency;
therefore, using memory coherency not required mode improves performance.

5.1.4.1.5 Pages Designated as Guarded

The guarded attribute pertains to out-of-order execution. Refer to Section 5.2.1.5.3,
“Out-of-Order Accesses to Guarded Memory,” for more information about out-of-order
execution.

When a page is designated as guarded, instructions and data cannot be accessed out of 
order. Additionally, if separate store instructions access memory that is both caching- 
inhibited and guarded, the accesses are performed in the order specified by the program. 
When a page is designated as not guarded, out-of-order fetches and accesses are allowed. 

Guarded pages are traditionally used for memory-mapped I/O devices. 

5.1.4.2 Coherency Precautions

Mismatched memory/cache attributes cause coherency paradoxes in both single-processor 
and multiprocessor systems. When the memory/cache access attributes are changed, it is 
critical that the cache contents reflect the new attribute settings. For example, if a block or 
page that had allowed caching becomes caching-inhibited, the appropriate cache blocks 
should be flushed to leave no indication that caching had previously been allowed. 

Although coherency paradoxes are considered programming errors, specific 
implementations may attempt to handle the offending conditions and minimize the negative 
effects on memory coherency. Bus operations that are generated for specific instructions 
and state conditions are not defined by the architecture. 

5.1.5 VEA Cache Management Instructions

The VEA defines instructions for controlling both the instruction and data caches. For 
implementations that have a unified instruction/data cache, instruction cache control 
instructions are valid instructions, but may function differently. 

NOTE: Any cache control instruction that generates an EA that corresponds to a direct- 
store segment (SR[T] = 1) is treated as a no-op. However, the direct-store facility 
is being phased out of the architecture and will not likely be supported in future 
devices. Thus, software should not depend on its effects. 

This section briefly describes the cache management instructions available to programs at 
the user privilege level. Additional descriptions of coding the VEA cache management
instructions are provided in Chapter 4, “Addressing Modes and Instruction Set Summary,”
and Chapter 8, “Instruction Set.” In the following instruction descriptions, the target is the 
cache block containing the byte addressed by the effective address. 
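
Finding the target of a cache management instruction amounts to rounding the effective address down to a cache block boundary. The sketch below assumes a 32-byte cache block purely for illustration; the actual size is implementation-specific, and the function names are hypothetical.

```c
#include <stdint.h>

#define CACHE_BLOCK_BYTES 32u  /* assumption; the size is implementation-specific */

/* Address of the first byte of the cache block that contains ea. */
uint32_t cache_block_base(uint32_t ea)
{
    return ea & ~(CACHE_BLOCK_BYTES - 1u);
}

/* Number of cache blocks touched by len bytes starting at ea. */
uint32_t cache_blocks_spanned(uint32_t ea, uint32_t len)
{
    if (len == 0)
        return 0;
    uint32_t first = cache_block_base(ea);
    uint32_t last  = cache_block_base(ea + len - 1u);
    return (last - first) / CACHE_BLOCK_BYTES + 1u;
}
```

Software that applies an instruction such as dcbst to a buffer would issue it once per block counted by `cache_blocks_spanned`, not once per byte.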

5.1.5.1 Data Cache Instructions

Data caches and unified caches must be consistent with other caches (data or unified), 
memory, and I/O data transfers. To ensure consistency, aliased effective addresses (two 
effective addresses that map to the same physical address) must have the same page offset. 

NOTE: Physical address is referred to as real address in the architecture specification. 

5.1.5.1.1 Data Cache Block Touch (dcbt) and Data Cache Block Touch for Store (dcbtst) Instructions

These instructions provide a method for improving performance through the use of 
software-initiated prefetch hints. However, these instructions do not guarantee that a cache 
block will be fetched. 

A program uses the dcbt instruction to request a cache block fetch before it is needed by
the program. The program can then use the data from the cache rather than fetching from
main memory.

The dcbtst instruction behaves similarly to the dcbt instruction. A program uses dcbtst to
request a cache block fetch to guarantee that a subsequent store will be to a cached location.

The processor does not invoke the exception handler for translation or protection violations 
caused by either of the touch instructions. Additionally, memory accesses caused by these 
instructions are not necessarily recorded in the page tables. If an access is recorded, then it 
is treated in a manner similar to that of a load from the addressed byte. Some 
implementations may not take any action based on the execution of these instructions, or 
they may prefetch the cache block corresponding to the EA into their cache. For 
information about the R and C bits, see Section 7.5.3, “Page History Recording.” 

Both dcbt and dcbtst are provided for performance optimization. These instructions do not
affect the correct execution of a program, regardless of whether they succeed (fetch the
cache block) or fail (do not fetch the cache block). If the target block is not accessible to
the program for loads, then no operation occurs.
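
From C, a touch hint is usually requested through a compiler builtin, not written by hand. The sketch below uses `__builtin_prefetch`, which GCC and Clang provide; that a PowerPC compiler lowers it to dcbt is an assumption about the toolchain. The prefetch distance is a hypothetical tuning value. Because the hint never changes program results, the function returns the same sum whether or not any block is actually fetched.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_DISTANCE 16u  /* elements ahead; hypothetical tuning value */

/* Sum an array, hinting that data a few iterations ahead will be loaded. */
uint64_t sum_with_prefetch(const uint32_t *data, size_t count)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
#if defined(__GNUC__)
        /* Second argument 0 requests a prefetch for reading. */
        if (i + PREFETCH_DISTANCE < count)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0);
#endif
        sum += data[i];
    }
    return sum;
}
```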

5.1.5.1.2 Data Cache Block Set to Zero (dcbz) Instruction

The dcbz instruction clears a single cache block as follows: 

• If the target is in the data cache, all bytes of the cache block are cleared. 

• If the target is not in the data cache and the corresponding page is caching-allowed, 
the cache block is established in the data cache (without fetching the cache block 
from main memory), and all bytes of the cache block are cleared. 

• If the target is designated as either caching-inhibited or write-through, then either all 
bytes in main memory that correspond to the addressed cache block are cleared, or 
the alignment exception handler is invoked. The exception handler should clear all 
the bytes in main memory that correspond to the addressed cache block. 

• If the target is designated as coherency required, and the cache block exists in the 
data cache(s) of any other processor(s), it is kept coherent in those caches. 

The dcbz instruction is treated as a store to the addressed byte with respect to address 
translation, protection, referenced and changed recording, and the ordering enforced by 
eieio or by the combination of caching-inhibited and guarded attributes for a page. 

Refer to Chapter 6, “Exceptions,” for more information about a possible delayed machine 
check exception that can occur by using dcbz when the operating system has set up an 
incorrect memory mapping. 

5.1.5.1.3 Data Cache Block Store (dcbst) Instruction

The dcbst instruction permits the program to ensure that the latest version of the target 
cache block is in main memory. The dcbst instruction executes as follows: 

• Coherency required — If the target exists in the data cache of any processor and has 
been modified, the data is written to main memory. Only one processor in a 
multiprocessor system should have possession of a modified cache block. 

• Coherency not required — If the target exists in the data cache of the executing 
processor and has been modified, the data is written to main memory. 

The PowerPC architecture does not specify whether the modified status of the cache block 
is left unchanged or is cleared (cleared implies valid-shared or valid-exclusive). That 
decision is left to the implementation of individual processors. Either state is logically 
correct. 

The function of this instruction is independent of the write-through/write-back and 
caching-inhibited/caching-allowed attributes of the target. 

The memory access caused by a dcbst instruction is not necessarily recorded in the page 
tables. If the access is recorded, then it is treated as a load operation (not as a store 
operation). 

5.1.5.1.4 Data Cache Block Flush (dcbf) Instruction

The action taken depends on the memory/cache access mode associated with the target, and 
on the state of the cache block. The following list describes the action taken for the various 
cases: 

• Coherency required 

Unmodified cache block — Invalidates copies of the cache block in the data caches 
of all processors. 

Modified cache block — Copies the cache block to memory. Invalidates the copy of 
the cache block in the data cache of any processor where it is found. There should 
only be one modified cache block in a coherency required multiprocessor system. 

Target block not in cache — If a modified copy of the cache block is in the data cache
of another processor, dcbf causes the modified cache block to be copied to memory
and then invalidated. If unmodified copies are in the data caches of other processors,
dcbf causes those copies to be invalidated.

• Coherency not required 

Unmodified cache block — Invalidates the cache block in the executing processor's 
data cache. 

Modified cache block — Copies the data cache block to memory and then invalidates 
the cache block in the executing processor. 

Target block not in cache — No action is taken. 

The function of this instruction is independent of the write-through/write-back and 
caching-inhibited/caching-allowed attributes of the target. 

The memory access caused by a dcbf instruction is not necessarily recorded in the page
tables. If the access is recorded, then it is treated as a load operation (not as a store
operation).
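
The coherency-not-required cases listed above reduce to a small decision table for the executing processor. The sketch below encodes only those three cases; the enum, struct, and function names are hypothetical, and the coherency-required cases, which involve other processors' caches, are deliberately left out.

```c
#include <stdbool.h>

enum block_state { BLOCK_ABSENT, BLOCK_UNMODIFIED, BLOCK_MODIFIED };

struct dcbf_action {
    bool copy_to_memory;  /* write the block back to main memory */
    bool invalidate;      /* drop the block from the data cache */
};

/* Action dcbf takes in the executing processor's data cache when the
 * target is in coherency-not-required mode. */
struct dcbf_action dcbf_local_action(enum block_state state)
{
    struct dcbf_action action = { false, false };

    switch (state) {
    case BLOCK_MODIFIED:          /* copy to memory, then invalidate */
        action.copy_to_memory = true;
        action.invalidate = true;
        break;
    case BLOCK_UNMODIFIED:        /* invalidate only */
        action.invalidate = true;
        break;
    case BLOCK_ABSENT:            /* no action is taken */
        break;
    }
    return action;
}
```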

5.1.5.2 Instruction-Cache Instructions

Instruction caches, if they exist, are not required to be consistent with data caches, memory, 
or I/O data transfers. Software must use the appropriate cache management instructions to 
ensure that instruction caches are kept coherent when instructions are modified by the 
processor or by input data transfer. When a processor alters a memory location that may be 
contained in an instruction cache, software must ensure that updates to memory are visible 
to the instruction fetching mechanism. Although the instructions to enforce consistency 
vary among implementations, the following sequence for a uniprocessor system is typical: 

1. dcbst (update memory)

2. sync (wait for update)

3. icbi (invalidate copy in instruction cache)

4. isync (perform context synchronization)
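
The four steps above can be sketched as a C routine that applies them to every cache block in a range of freshly written instructions. Several things here are assumptions and not statements from this manual: the 32-byte block size, the GCC-style inline assembly on the PowerPC path (which has not been verified on hardware), and the use of the GCC/Clang builtin `__builtin___clear_cache` as a stand-in on other hosts. The function name is hypothetical, and the routine returns the number of blocks it covered only so that its range arithmetic can be checked.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK_BYTES 32u  /* assumption; the size is implementation-specific */

/* Make len bytes of modified instructions at start visible to instruction
 * fetch on a uniprocessor. Returns the number of cache blocks covered. */
size_t sync_icache_range(void *start, size_t len)
{
    uintptr_t first = (uintptr_t)start & ~(uintptr_t)(CACHE_BLOCK_BYTES - 1u);
    uintptr_t end   = (uintptr_t)start + len;
    size_t blocks = 0;

    if (len == 0)
        return 0;

#if defined(__GNUC__) && (defined(__powerpc__) || defined(__PPC__))
    for (uintptr_t p = first; p < end; p += CACHE_BLOCK_BYTES)
        __asm__ volatile("dcbst 0,%0" : : "r"(p) : "memory"); /* 1. update memory */
    __asm__ volatile("sync" : : : "memory");                  /* 2. wait for update */
    for (uintptr_t p = first; p < end; p += CACHE_BLOCK_BYTES)
        __asm__ volatile("icbi 0,%0" : : "r"(p) : "memory");  /* 3. invalidate */
    __asm__ volatile("isync" : : : "memory");                 /* 4. context sync */
#elif defined(__GNUC__)
    /* Non-PowerPC host: let the compiler's builtin do the equivalent work. */
    __builtin___clear_cache((char *)start, (char *)start + len);
#endif

    for (uintptr_t p = first; p < end; p += CACHE_BLOCK_BYTES)
        blocks++;
    return blocks;
}
```

Application code would normally call the operating system's service for this function instead of issuing the instructions itself.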

NOTE: Most operating systems will provide a system service for this function. These
operations are necessary because the memory may be designated as write-back. 
Since instruction fetching may bypass the data cache, changes made to items in 
the data cache may not otherwise be reflected in memory until after the 
instruction fetch completes. 

For implementations used in multiprocessor systems, variations on this sequence may be 
recommended. For example, in a multiprocessor system with a unified instruction/data 
cache (at any level), if instructions are fetched without coherency being enforced, the 
preceding instruction sequence is inadequate. Because the icbi instruction does not 
invalidate blocks in a unified cache, a dcbf instruction should be used instead of a dcbst 
instruction for this case. 

5.1.5.2.1 Instruction Cache Block Invalidate (icbi) Instruction

The icbi instruction executes as follows: 

• Coherency required 

If the target is in the instruction cache of any processor, the cache block is made 
invalid in all such processors, so that the next reference causes the cache block to be 
refetched. 

• Coherency not required 

If the target is in the instruction cache of the executing processor, the cache block is 
made invalid in the executing processor so that the next reference causes the cache 
block to be refetched. 

The icbi instruction is provided for use in processors with separate instruction and data 
caches. The effective address is computed, translated, and checked for protection violations 
as defined in Chapter 7, “Memory Management.” If the target block is not accessible to the 
program for loads, then a DSI exception occurs. 

The function of this instruction is independent of the write-through/write-back and 
caching-inhibited/caching-allowed attributes of the target. 

The memory access caused by an icbi instruction is not necessarily recorded in the page 
tables. If the access is recorded, then it is treated as a load operation. Implementations that 
have a unified cache treat the icbi instruction as a no-op except that they may invalidate the 
target cache block in the instruction caches of other processors (in coherency required 
mode). 

5.1.5.2.2 Instruction Synchronize (isync) Instruction

The isync instruction provides an ordering function for the effects of all instructions 
executed by a processor. Executing an isync instruction ensures that all instructions 
preceding the isync instruction have completed before the isync instruction completes, 
except that memory accesses caused by those instructions need not have been performed 
with respect to other processors and mechanisms. It also ensures that no subsequent 
instructions are initiated by the processor until after the isync instruction completes. 
Finally, it causes the processor to discard any prefetched instructions, with the effect that 
subsequent instructions will be fetched and executed in the context established by the 
instructions preceding the isync instruction. The isync instruction has no effect on other 
processors or on their caches. 

5.2 The Operating Environment

The OEA defines the mechanism for controlling the memory/cache access modes 
introduced in Section 5.1.4.1, “Memory/Cache Access Modes.” This section describes the
cache-related aspects of the OEA including the memory/cache access attributes, out-of- 
order execution, direct-store interface considerations, and the dcbi instruction. The features 
of the OEA are accessible to supervisor-level applications only. The mechanism for 
controlling the virtual memory space is described in Chapter 7, “Memory Management.” 

The memory model of PowerPC processors provides the following features: 

• Flexibility to allow performance benefits of weakly-ordered memory access 

• A mechanism to maintain memory coherency among processors and between a 
processor and I/O devices controlled at the block and page level 

• Instructions that can be used to ensure a consistent memory state 

• Guaranteed processor access order 

The memory implementations in PowerPC systems can take advantage of the performance 
benefits of weak ordering of memory accesses between processors or between processors 
and other external devices without any additional complications. Memory coherency can 
be enforced externally by a snooping bus design, a centralized cache directory design, or 
other designs that can take advantage of the coherency features of PowerPC processors. 

Memory accesses performed by a single processor appear to complete sequentially from 
the view of the programming model but may complete out of order with respect to the 
ultimate destination in the memory hierarchy. Order is guaranteed at each level of the 
memory hierarchy for accesses to the same address from the same processor. The dcbst, 
dcbf, icbi, isync, sync, eieio, lwarx, and stwcx. instructions allow the programmer to 
ensure a consistent memory state. 

5.2.1 Memory/Cache Access Attributes

All instruction and data accesses are performed under the control of the four memory/cache 
access attributes: 

• Write-through (W attribute) 

• Caching-inhibited (I attribute) 

• Memory coherency (M attribute) 

• Guarded (G attribute) 

These attributes are maintained in the PTEs and BATs by the operating system for each 
page and block respectively. The W and I attributes control how the processor performing 
an access uses its own cache. The M attribute ensures that coherency is maintained for all 
copies of the addressed memory location. When an access requires coherency, the 
processor performing the access must inform the coherency mechanisms throughout the 
system that the access requires memory coherency. The G attribute prevents out-of-order 
loading and prefetching from the addressed memory location. 

NOTE: The memory/cache access attributes are relevant only when an effective address
is translated by the processor performing the access. Also, not all combinations
of settings of these bits are supported. The attributes are not saved along with data
in the cache (for cacheable accesses), nor are they associated with subsequent
accesses made by other processors.

The operating system maintains the memory/cache access attribute for each page or block 
as required. The WIMG attributes occupy four bits in the BAT registers for block address 
translation and in the PTEs for page address translation. The WIMG bits are defined as 
follows: 

• The operating system uses the mtspr instruction to store the WIMG bits in the BAT 
registers for block address translation. The IBAT register pairs implement the W or 
G bits; however, attempting to set either bit in IBAT registers causes boundedly- 
undefined results. 

• The operating system stores the WIMG bits for each page into the PTEs in system 
memory as it sets up the page tables. 

NOTE: For data accesses performed in real addressing mode (MSR[DR] = 0), the
WIMG bits are assumed to be 0b0011 (the data is write-back, caching is enabled,
memory coherency is enforced, and memory is guarded). For instruction
accesses performed in real addressing mode (MSR[IR] = 0), the WIMG bits are
assumed to be 0b0001 (the data is write-back, caching is enabled, memory
coherency is not enforced, and memory is guarded). 
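
As an illustration of how a 4-bit WIMG value reads, the following Python sketch unpacks it into named flags and records the two real addressing mode defaults from the note above. The names Wimg and decode_wimg are hypothetical, and the sketch treats WIMG as a standalone 4-bit value with W as the most-significant bit; it does not model the actual bit positions in the BAT registers or PTEs.

```python
from typing import NamedTuple


class Wimg(NamedTuple):
    write_through: bool      # W
    caching_inhibited: bool  # I
    memory_coherent: bool    # M
    guarded: bool            # G


def decode_wimg(bits: int) -> Wimg:
    """Split a 4-bit WIMG value (W is the most-significant bit) into flags."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("WIMG is a 4-bit field")
    return Wimg(bool(bits & 0b1000), bool(bits & 0b0100),
                bool(bits & 0b0010), bool(bits & 0b0001))


# Defaults assumed in real addressing mode, per the note above.
REAL_MODE_DATA_WIMG = decode_wimg(0b0011)   # MSR[DR] = 0
REAL_MODE_INSTR_WIMG = decode_wimg(0b0001)  # MSR[IR] = 0
```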


5.2.1.1 Write-Through Attribute (W)

When an access is designated as write-through (W = 1), if the data is in the cache, a store 
operation updates the cached copy of the data. In addition, the update is written to the 
memory location. The definition of the memory location to be written to (in addition to the 
cache) depends on the implementation of the memory system but can be illustrated by the 
following examples: 

• RAM — The store is sent to the RAM controller to be written into the target RAM. 

• I/O device — The store is sent to the memory-mapped I/O controller to be written to 
the target register or memory location. 

In systems with multilevel caching, the store must be written to at least a depth in the 
memory hierarchy that is seen by all processors and devices. 

Multiple store instructions may be combined for write-through accesses except when the 
store instructions are separated by a sync or eieio instruction. A store operation to a 
memory location designated as write-through may cause any part of the cache block to be 
written back to main memory. 

Accesses that correspond to W = 0 are considered write-back. For this case, although the 
store operation is performed to the cache, the data is copied to memory only when a copy- 
back operation is required. Use of the write-back mode (W = 0) can improve overall 
performance for areas of the memory space that are seldom referenced by other processors 
or devices in the system. 

Accesses to the same memory location using two effective addresses for which the W bit 
setting differs meet the memory-coherency requirements if the accesses are performed by 
a single processor. If the accesses are performed by two or more processors, coherence is 
enforced by the hardware only if the write-through attribute is the same for all the accesses. 
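
The difference between write-through (W = 1) and write-back (W = 0) stores can be sketched with a toy single-level cache. This Python model is a simplification under stated assumptions: it tracks individual addresses rather than cache blocks, a write-back store that misses allocates an entry, and a write-through store that misses does not. The function names store and copy_back are hypothetical, and none of these allocation choices are mandated by the architecture.

```python
def store(cache, memory, addr, value, write_through):
    """Model a store to a cacheable location.

    cache maps address -> (value, modified flag); memory maps address -> value.
    """
    if write_through:
        # W = 1: update the cached copy if present, and always update memory.
        if addr in cache:
            cache[addr] = (value, False)
        memory[addr] = value
    else:
        # W = 0: the store is performed to the cache; memory is updated
        # only later, when a copy-back is required.
        cache[addr] = (value, True)


def copy_back(cache, memory, addr):
    """Write a modified cached value to memory and mark it unmodified."""
    value, modified = cache[addr]
    if modified:
        memory[addr] = value
        cache[addr] = (value, False)
```

After a write-back store, memory still holds the old value until copy_back runs; after a write-through store, the cache and memory agree immediately.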

5.2.1.2 Caching-Inhibited Attribute (I)

If I = 1, the memory access is completed by referencing the location in main memory, 
bypassing the cache. During the access, the addressed location is not loaded into the cache 
nor is the location allocated in the cache. 

It is considered a programming error if a copy of the target location of an access to caching- 
inhibited memory is resident in the cache. Software must ensure that the location has not 
been previously loaded into the cache, or, if it has, that it has been flushed from the cache. 

Data accesses from more than one instruction may be combined for cache-inhibited 
operations, except when the accesses are separated by a sync instruction, or by an eieio 
instruction when the page or block is also designated as guarded. 

Instruction fetches, dcbz instructions, and load and store operations to the same memory 
location using two effective addresses for which the I bit setting differs must meet the 
requirement that a copy of the target location of an access to caching-inhibited memory not 
be in the cache. Violation of this requirement is considered a programming error; software
must ensure that the location has not previously been brought into the cache or, if it has, 
that it has been flushed from the cache. If the programming error occurs, the result of the 
access is boundedly undefined. It is not considered a programming error if the target 
location of any other cache management instruction to caching-inhibited memory is in the 
cache. 

5.2.1.3 Memory Coherency Attribute (M)

This attribute is provided to allow improved performance in systems where hardware- 
enforced coherency is relatively slow, and software is able to enforce the required 
coherency. When M = 0, there are no requirements to enforce data coherency. When M = 1, 
the processor enforces data coherency. 

When the M attribute is set, and the access is performed to memory, there is a hardware 
indication to the rest of the system that the access is global. Other processors affected by 
the access must then respond to this global access. For example, in a snooping bus design, 
the processor may assert some type of global access signal. Other processors affected by 
the access respond and signal whether the data is being shared. If the data in another 
processor is modified, then the location is updated and the access is retried. 

Because instruction memory does not have to be coherent with data memory, some 
implementations may ignore the M attribute for instruction accesses. In a single-processor 
(or single-cache) system, performance might be improved by designating all pages as 
memory coherency not required. 

Accesses to the same memory location using two effective addresses for which the M bit 
settings differ may require explicit software synchronization before accessing the location 
with M = 1 if the location has previously been accessed with M = 0. Any such requirement 
is system-dependent. For example, no software synchronization may be required for 
systems that use bus snooping. In some directory-based systems, software may be required 
to execute dcbf instructions on each processor to flush all storage locations accessed with 
M = 0 before accessing those locations with M = 1.

5.2.1.4 W, I, and M Bit Combinations

Table 5-1 summarizes the six combinations of the WIM bits supported by the OEA. The 
combinations where WIM = 11x are not supported.

NOTE: Either a zero or one setting for the G bit is allowed for each of these WIM bit
combinations. 


Table 5-1. Combinations of W, I, and M Bits

WIM Setting   Meaning

000           The processor may cache data (or instructions).
              A load or store operation whose target hits in the cache may use that
              entry in the cache.
              The processor does not need to enforce memory coherency for accesses
              it initiates.

001           Data (or instructions) may be cached.
              A load or store operation whose target hits in the cache may use that
              entry in the cache.
              The processor enforces memory coherency for accesses it initiates.

010           Caching is inhibited.
              The access is performed to memory, completely bypassing the cache.
              The processor does not need to enforce memory coherency for accesses
              it initiates.

011           Caching is inhibited.
              The access is performed to memory, completely bypassing the cache.
              The processor enforces memory coherency for accesses it initiates.

100           Data (or instructions) may be cached.
              A load operation whose target hits in the cache may use that entry in
              the cache.
              Store operations are written to memory. The target location of the
              store may be cached and is updated on a hit.
              The processor does not need to enforce memory coherency for accesses
              it initiates.

101           Data (or instructions) may be cached.
              A load operation whose target hits in the cache may use that entry in
              the cache.
              Store operations are written to memory. The target location of the
              store may be cached and is updated on a hit.
              The processor enforces memory coherency for accesses it initiates.
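
The rule behind Table 5-1 is short enough to state as a predicate. The Python sketch below is a hedged illustration, with the hypothetical name wim_supported; it encodes only the statement that WIM = 11x is unsupported and says nothing about what a particular implementation does when such a setting is attempted.

```python
def wim_supported(w: int, i: int, m: int) -> bool:
    """Return True for the six WIM combinations listed in Table 5-1."""
    for bit in (w, i, m):
        if bit not in (0, 1):
            raise ValueError("W, I, and M must each be 0 or 1")
    # W = 1 together with I = 1 (WIM = 11x) is the unsupported case.
    return not (w == 1 and i == 1)


# Enumerate all eight settings and keep the supported ones.
SUPPORTED = [(w, i, m) for w in (0, 1) for i in (0, 1) for m in (0, 1)
             if wim_supported(w, i, m)]
```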


5.2.1.5 The Guarded Attribute (G)

When the guarded bit is set, the memory area (block or page) is designated as guarded. This 
setting can be used to protect certain memory areas from read accesses made by the 
processor that are not dictated directly by the program. If there are areas of physical 
memory that are not fully populated (in other words, there are holes in the physical memory 
map within this area), this setting can protect the system from undesired accesses caused 
by out-of-order load operations or instruction prefetches that could lead to the generation 
of the machine check exception. Also, the guarded bit can be used to prevent out-of-order 
(speculative) load operations or prefetches from occurring to certain peripheral devices that 
produce undesired results when accessed in this way. 

5.2.1.5.1 Performing Operations Out of Order

An operation is said to be performed in-order if it is guaranteed to be required by the 
sequential execution model. Any other operation is said to be performed out of order. 

Operations are performed out of order by the hardware on the expectation that the results 
will be needed by an instruction that will be required by the sequential execution model. 
Whether the results are really needed is contingent on everything that might divert the 
control flow away from the instruction, such as branch, trap, system call, and rfi 
instructions, and exceptions, and on everything that might change the context in which the 
instruction is executed. 

Typically, the hardware performs operations out of order when it has resources that would 
otherwise be idle, so the operation incurs little or no cost. If subsequent events such as 
branches or exceptions indicate that the operation would not have been performed in the 
sequential execution model, the processor abandons any results of the operation (except as
described below). 

Most operations can be performed out of order, as long as the machine appears to follow 
the sequential execution model. Certain out-of-order operations are restricted, as follows. 

• Stores 

A store instruction may not be executed out of order in a manner such that the 
alteration of the target location can be observed by other processors or mechanisms. 

• Accessing guarded memory 

The restrictions for this case are given in Section 5.2.1.5.3, “Out-of-Order Accesses
to Guarded Memory.” 

No error of any kind other than a machine check exception may be reported due to an 
operation that is performed out of order, until such time as it is known that the operation is 
required by the sequential execution model. The only other permitted side effects (other 
than machine check) of performing an operation out of order are the following: 

• Referenced and changed bits may be set as described in Section 7.2.5, “Page History 
Information.” 

• Nonguarded memory locations that could be fetched into a cache by in-order 
execution may be fetched out of order into that cache. 

5.2.1.5.2 Guarded Memory

Memory is said to be well behaved if the corresponding physical memory exists and is not 
defective, and if the effects of a single access to it are indistinguishable from the effects of 
multiple identical accesses to it. Data and instructions can be fetched out of order from 
well-behaved memory without causing undesired side effects. 

Memory is said to be guarded if either (a) the G bit is 1 in the relevant PTE or DBAT 
register, or (b) the processor is in real addressing mode (MSR[IR] = 0 or MSR[DR] = 0 for 
instruction fetches or data accesses respectively). In case (b), all of memory is guarded for 
the corresponding accesses. In general, memory that is not well-behaved should be 
guarded. Because such memory may represent an I/O device or may include locations that 
do not exist, an out-of-order access to such memory may cause an I/O device to perform 
incorrect operations or may result in a machine check. 

NOTE: If separate store instructions access memory that is both caching-inhibited and 
guarded, the accesses are performed in the order specified by the program. If an 
aligned, elementary load or store to caching-inhibited, guarded memory has 
accessed main memory and an external, decrementer, or imprecise-mode 
floating-point enabled exception is pending, the load or store is completed before 
the exception is taken. 
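
The definition of guarded memory given above, cases (a) and (b), can be written as a small predicate. This Python sketch is illustrative only; the function name is_guarded and its string-valued access argument are assumptions made for the example.

```python
def is_guarded(access, g_bit, msr_ir, msr_dr):
    """Return True if the access targets guarded memory.

    access  -- "instruction" for a fetch, "data" for a load or store
    g_bit   -- G bit from the relevant PTE or DBAT (ignored in real mode)
    msr_ir, msr_dr -- MSR[IR] and MSR[DR] values (0 or 1)
    """
    if access == "instruction":
        translation_enabled = msr_ir == 1
    elif access == "data":
        translation_enabled = msr_dr == 1
    else:
        raise ValueError("access must be 'instruction' or 'data'")
    if not translation_enabled:
        return True  # case (b): real addressing mode, all memory is guarded
    return g_bit == 1  # case (a): G bit set in the PTE or DBAT
```

For example, a data access with MSR[DR] = 0 is guarded regardless of any G bit, while the same access with MSR[DR] = 1 is guarded only if G = 1.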

5.2.1.5.3 Out-of-Order Accesses to Guarded Memory

The circumstances in which guarded memory may be accessed out of order are as follows: 

• Load instruction 

If a copy of the target location is in a cache, the location may be accessed in the 
cache or in main memory. 

• Instruction fetch 

In real addressing mode (MSR[IR] = 0), an instruction may be fetched if any of the 
following conditions is met: 

— The instruction is in a cache. In this case, it may be fetched from that cache. 

— The instruction is in the same physical page as an instruction that is required by 
the sequential execution model or is in the physical page immediately following 
such a page. 

If MSR[IR] = 1, instructions may not be fetched from either no-execute segments or
guarded memory. If the effective address of the current instruction is mapped to 
either of these kinds of memory when MSR[IR] = 1, an ISI exception is generated. 
However, it is permissible for an instruction from either of these kinds of memory 
to be in the instruction cache if it was fetched into that cache when its effective 
address was mapped to some other kind of memory. Thus, for example, the 
operating system can access an application's instruction segments as no-execute 
without having to invalidate them in the instruction cache. 

Additionally, instructions are not fetched from direct-store segments (only applies 
when MSR[IR] = 1). If an instruction fetch is attempted from a direct-store segment, 
an ISI exception is generated. 

NOTE: The direct-store facility is being phased out of the architecture and will not likely 
be supported in future devices. Thus, software should not depend on its effects. 

Software should ensure that only well-behaved memory is loaded into a cache, either by 
marking as caching-inhibited (and guarded) all memory that may not be well-behaved, or 
by marking such memory caching-allowed (and guarded) and referring only to cache 
blocks that are well-behaved. 

If a physical page contains instructions that will be executed in real addressing mode 
(MSR[IR] = 0), software should ensure that this physical page and the next physical page 
contain only well-behaved memory. 
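
The real addressing mode conditions for out-of-order instruction fetch can be sketched as a check on physical page numbers. The Python below is a simplified illustration: the function name may_fetch_out_of_order is hypothetical, and the 4096-byte page size is an assumption of the example rather than something stated in this section.

```python
PAGE_SIZE = 4096  # assumed physical page size in bytes (4 KB)


def may_fetch_out_of_order(target_pa, required_pas, in_cache):
    """Real addressing mode (MSR[IR] = 0): may target_pa be fetched out of order?

    required_pas -- physical addresses of instructions required by the
                    sequential execution model
    in_cache     -- True if the target instruction is already in a cache
    """
    if in_cache:
        return True  # it may be fetched from that cache
    target_page = target_pa // PAGE_SIZE
    for pa in required_pas:
        required_page = pa // PAGE_SIZE
        # Same physical page, or the page immediately following it.
        if target_page in (required_page, required_page + 1):
            return True
    return False
```

The check also shows why software should keep both the executing page and the next physical page well-behaved: either one may be touched by an out-of-order fetch.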

5.2.2 I/O Interface Considerations

The PowerPC architecture defines two mechanisms for accessing I/O: 

• Memory-mapped I/O interface operations where SR[T] = 0. These operations are 
considered to address memory space and are therefore subject to the same coherency 
control as memory accesses. Depending on the specific I/O interface, the 
memory/cache access attributes (WIMG) and the degree of access ordering 
(requiring eieio or sync instructions) need to be considered. This is the 
recommended way of accessing I/O. 

• Direct-store segment operations where SR[T] = 1. These operations are considered 
to address the noncoherent and noncacheable direct-store segment space; therefore, 
hardware need not maintain coherency for these operations, and the cache is 
bypassed completely. Although the architecture defines this direct-store 
functionality, it is being phased out of the architecture and will not likely be 
supported in future devices. Thus, its use is discouraged, and new software should 
not use it or depend on its effects. 

5.2.3 OEA Cache Management Instruction — Data Cache Block Invalidate (dcbi)

As described in Section 5.1.5, “VEA Cache Management Instructions,” the VEA defines 
instructions for controlling both the instruction and data caches. The OEA defines one
instruction, the data cache block invalidate (dcbi) instruction, for controlling the data 
cache. This section briefly describes the cache management instruction available to 
programs at the supervisor privilege level. Additional descriptions of coding the dcbi 
instruction are provided in Chapter 4, “Addressing Modes and Instruction Set Summary,” 
and Chapter 8, “Instruction Set.” In the following description, the target is the cache block 
containing the byte addressed by the effective address. 

Any cache management instruction that generates an EA that corresponds to a direct-store 
segment (SR[T] = 1) is treated as a no-op. 

NOTE: The direct-store facility is being phased out of the architecture and will not likely 
be supported in future devices. Thus, software should not depend on its effects. 

The action taken depends on the memory/cache access mode associated with the target, and 
on the state of the cache block. The following list describes the action taken for the various 
cases: 

• Coherency required 

Unmodified cache block — Invalidates copies of the cache block in the data caches 
of all processors. 

Modified cache block — Invalidates the copy of the cache block in the data cache of 
the processor where it is found. (Discards the modified data in the cache block.) 
There can only be one modified cache block in a coherency required system. 

Target block not in cache — If copies of the target are in the data caches of other
processors, dcbi causes those copies to be invalidated, regardless of whether the data 
is modified (see modified cache block above) or unmodified. 

• Coherency not required 

Unmodified cache block — Invalidates the cache block in the executing processor's 
data cache. 

Modified cache block — Invalidates the cache block in the executing processor's data 
cache. (Discards the modified data in the cache block.) 

Target block not in cache — No action is taken. 

The processor treats the dcbi instruction as a store to the addressed byte with respect to 
address translation and protection. It is not necessary to set the referenced and changed bits. 

The function of this instruction is independent of the write-through/write-back and 
caching-inhibited/caching-allowed attributes of the target. To ensure coherency, aliased 
effective addresses (two effective addresses that map to the same physical address) must 
have the same page offset. 
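
The dcbi cases listed above can be summarized in a toy model. This Python sketch is illustrative only: the function name dcbi and the list-of-dicts cache representation are assumptions, the block is identified directly by its address, and translation, protection, and privilege checks are left out.

```python
def dcbi(dcaches, executing_cpu, block, coherency_required):
    """Model dcbi on `block`; return the processors whose copy was invalidated.

    dcaches is a list of dicts, one per processor, mapping a block address
    to True if that cached copy is modified and False otherwise.
    """
    if coherency_required:
        targets = range(len(dcaches))  # every processor holding a copy
    else:
        targets = [executing_cpu]      # only the executing processor
    invalidated = []
    for cpu in targets:
        if block in dcaches[cpu]:
            # Modified data is discarded, not written back to memory.
            del dcaches[cpu][block]
            invalidated.append(cpu)
    return invalidated
```

Because modified data is discarded rather than written back, the model makes it visible that dcbi can lose stores, which is consistent with the instruction being restricted to the supervisor level.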

Chapter 6. Exceptions


The operating environment architecture (OEA) portion of the PowerPC architecture defines 
the mechanism by which PowerPC processors implement exceptions (referred to as 
interrupts in the architecture specification). Exception conditions may be defined at other 
levels of the architecture. For example, the user instruction set architecture (UISA) defines 
conditions that may cause floating-point exceptions; the OEA defines the mechanism by 
which the exception is taken. 



The PowerPC exception mechanism allows the processor to change to supervisor state as a 
result of external signals, errors, or unusual conditions arising in the execution of 
instructions. When exceptions occur, information about the state of the processor is saved 
to certain registers and the processor begins execution at an address (exception vector) 
predetermined for each exception. Processing of exceptions begins in supervisor mode. 


Although multiple exception conditions can map to a single exception vector, a more 
specific condition may be determined by examining a register associated with the 
exception — for example, the DSISR and the floating-point status and control register 
(FPSCR). Additionally, certain exception conditions can be explicitly enabled or disabled 
by software. 

The PowerPC architecture requires that exceptions be taken in program order; therefore, 
although a particular implementation may recognize exception conditions out of order, they 
are handled strictly in order with respect to the instruction stream. When an instruction- 
caused exception is recognized, any unexecuted instructions that appear earlier in the 
instruction stream, including any that have not yet entered the execute state, are required to 
complete before the exception is taken. For example, if a single instruction encounters 
multiple exception conditions, those exceptions are taken and handled sequentially. 
Likewise, exceptions that are asynchronous and precise are recognized when they occur, 
but are not handled until all instructions currently in the execute stage successfully 
complete execution and report their results. 

NOTE: Exceptions can occur while an exception handler routine is executing, and
multiple exceptions can become nested. It is up to the exception handler to save
the appropriate machine state if it is desired to allow control to ultimately return 
to the excepting program. 

In many cases, after the exception handler handles an exception, there is an attempt to 
execute the instruction that caused the exception. Instruction execution continues until the 
next exception condition is encountered. This method of recognizing and handling
exception conditions sequentially guarantees that the machine state is recoverable and 
processing can resume without losing instruction results. 

To prevent the loss of state information, exception handlers must save the information 
stored in SRR0 and SRR1 soon after the exception is taken to prevent this information from
being lost due to another exception being taken. 
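
Why the save must happen early can be seen in a toy model in which taking an exception unconditionally overwrites SRR0 and SRR1. The Python sketch below is an illustration under stated assumptions: the Cpu class, its method names, and the software save area are hypothetical, and the numeric values used with it are arbitrary.

```python
class Cpu:
    """Toy model with only the state needed to show SRR0/SRR1 being overwritten."""

    def __init__(self):
        self.srr0 = 0    # address used to resume the interrupted program
        self.srr1 = 0    # saved machine state (bits copied from the MSR)
        self.saved = []  # software save area (hypothetical handler stack)

    def take_exception(self, resume_addr, msr):
        # Hardware action: overwrite SRR0 and SRR1 unconditionally.
        self.srr0, self.srr1 = resume_addr, msr

    def handler_save(self):
        # Software action: copy SRR0/SRR1 before another exception can occur.
        self.saved.append((self.srr0, self.srr1))

    def handler_restore(self):
        # Software action: reload SRR0/SRR1 before returning (for example, with rfi).
        self.srr0, self.srr1 = self.saved.pop()
```

If a second exception is taken before handler_save runs, the first pair of values is gone and the original program cannot be resumed; once the pair is saved, nested exceptions unwind in last-in, first-out order.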


In this chapter, the following terminology is used to describe the various stages of exception 
processing: 


Recognition   Exception recognition occurs when the condition that can cause an
              exception is identified by the processor.

Taken         An exception is said to be taken when control of instruction
              execution is passed to the exception handler; that is, the context is
              saved and the instruction at the appropriate vector offset is fetched
              and the exception handler routine is begun in supervisor mode.

Handling      Exception handling is performed by the software linked to the
              appropriate vector offset. Exception handling is begun in supervisor
              mode (referred to as privileged state in the architecture
              specification).

6.1 Exception Classes

As specified by the PowerPC architecture, all exceptions can be described as either precise 
or imprecise and either synchronous or asynchronous. Asynchronous exceptions are caused 
by events external to the processor’s execution; synchronous exceptions are caused by 
instructions. 

The PowerPC exception types are shown in Table 6-1. 

Table 6-1. PowerPC Exception Classifications

Type                        Exception

Asynchronous/nonmaskable    Machine Check
                            System Reset

Asynchronous/maskable       External interrupt
                            Decrementer

Synchronous/Precise         Instruction-caused exceptions, excluding floating-point
                            imprecise exceptions

Synchronous/Imprecise       Instruction-caused imprecise exceptions
                            (Floating-point imprecise exceptions)


Exceptions, their offsets, and conditions that cause them, are summarized in Table 6-2. The 
exception vectors described in the table correspond to physical address locations, 
depending on the value of MSR[IP]. Refer to Section 7.2.1.2, “Predefined Physical
Memory Locations,” for a complete list of the predefined physical memory areas. 
Remaining sections in this chapter provide more complete descriptions of the exceptions 
and of the conditions that cause them. 
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Table 6-2. Exceptions and Conditions — Overview 


Exception 

Type 

Vector Offset 
(hex) 

Causing Conditions 

System reset 

00100 

The causes of system reset exceptions are implementation-dependent. If the 
conditions that cause the exception also cause the processor state to be corrupted 
such that the contents of SRR0 and SRR1 are no longer valid or such that other 
processor resources are so corrupted that the processor cannot reliably resume 
execution, the copy of the Rl bit copied from the MSR to SRR1 is cleared. 

Machine 

check 

00200 

The causes for machine check exceptions are implementation-dependent, but 
typically these causes are related to conditions such as bus parity errors or 
attempting to access an invalid physical address. Typically, these exceptions are 
triggered by an input signal to the processor. 

Note: Not all processors provide the same level of error checking. 

The machine check exception is disabled when MSR[ME] = 0. If a machine check 
exception condition exists and the ME bit is cleared, the processor goes into the 
checkstop state. 

If the conditions that cause the exception also cause the processor state to be 
corrupted such that the contents of SRR0 and SRR1 are no longer valid or such that 
other processor resources are so corrupted that the processor cannot reliably resume 
execution, the copy of the Rl bit written from the MSR to SRR1 is cleared. 

Note: Physical address is referred to as real address in the architecture specification. 

DSI 

00300 

A DSI exception occurs when a data memory access cannot be performed for any of 
the reasons described in Section 6.4.3, “DSI Exception (0x00300).” Such accesses 
can be generated by load/store instructions, certain memory control instructions, and 
certain cache control instructions. 

ISI 

00400 

An ISI exception occurs when an instruction fetch cannot be performed for a variety of 
reasons described in Section 6.4.4, “ISI Exception (0x00400).” 

External 

interrupt 

00500 

An external interrupt is generated only when an external interrupt is pending (typically 
signalled by a signal defined by the implementation) and the interrupt is enabled 
(MSR[EE] = 1). 

Alignment 

00600 

An alignment exception may occur when the processor cannot perform a memory 
access for reasons described in Section 6.4.6, “Alignment Exception (0x00600).” 

Note: An implementation is allowed to perform the operation correctly and not cause 
an alignment exception. 


























Table 6-2. Exceptions and Conditions — Overview (Continued) 


Exception 

Type 

Vector Offset 
(hex) 

Causing Conditions 

Program 

00700 

A program exception is caused by one of the following exception conditions, which 

correspond to bit settings in SRR1 and arise during execution of an instruction: 

• Floating-point enabled exception — A floating-point enabled exception condition is 
generated when MSR[FE0-FE1] 00 and FPSCR[FEX] is set. The settings of FE0 
and FE1 are described in Table 6-3. 

FPSCR[FEX] is set by the execution of a floating-point instruction that causes an 
enabled exception or by the execution of a Move to FPSCR instruction that sets 
both an exception condition bit and its corresponding enable bit in the FPSCR. 
These exceptions are described in Section 3.3.6, “Floating-Point Program 
Exceptions.” 

• Illegal instruction — An illegal instruction program exception is generated when 
execution of an instruction is attempted with an illegal opcode or illegal 
combination of opcode and extended opcode fields or when execution of an 
optional instruction not provided in the specific implementation is attempted (these 
do not include those optional instructions that are treated as no-ops). The 

PowerPC instruction set is described in Chapter 4, “Addressing Modes and 
Instruction Set Summary.” See Section 6.4.7, “Program Exception (0x00700),” for 
a complete list of causes for an illegal instruction program exception. 

• Privileged instruction — A privileged instruction type program exception is 
generated when the execution of a privileged instruction is attempted and the 

MSR user privilege bit, MSR[PR], is set. This exception is also generated for 
mtspr or mfspr with an invalid SPR field if spr[0] = 1 and MSR[PR] = 1 . 

• Trap — A trap type program exception is generated when any of the conditions 
specified in a trap instruction is met. 

For more information, refer to Section 6.4.7, “Program Exception (0x00700).” 

Floating-point unavailable

00800 

A floating-point unavailable exception is caused by an attempt to execute a floating- 
point instruction (including floating-point load, store, and move instructions) when the 
floating-point available bit is cleared, MSR[FP] = 0. 

Decrementer 

00900 

The decrementer exception is taken if it is pending and enabled (MSR[EE] = 1).
The exception is created when the most-significant bit of the decrementer changes
from 0 to 1. If the exception is not enabled, it remains pending until it is taken.

Reserved 

00A00 

This is reserved for implementation-specific exceptions. For example, the 601 uses 
this vector offset for direct-store exceptions. 

Reserved 

00B00 

— 

System call 

00C00

A system call exception occurs when a System Call (sc) instruction is executed. 

Trace 

00D00 

Implementation of the trace exception is optional. If implemented, it occurs if either
MSR[SE] = 1 and almost any instruction completes successfully, or MSR[BE] = 1
and a branch instruction completes. See Section 6.4.11, “Trace Exception
(0x00D00),” for more information.

Floating-point assist

00E00 

Implementation of the floating-point assist exception is optional. This exception can 
be used to provide software assistance for infrequent and complex floating-point 
operations such as denormalization. 

Reserved 

00E10-00FFF

— 

Reserved 

01000-02FFF

This is reserved for implementation-specific purposes. It may be used for
implementation-specific exception vectors or for other purposes.



































6.1.1 Precise Exceptions 

When a precise exception occurs, SRR0 is set to point to the first instruction that has not
completed execution, and all prior instructions in the instruction stream have completed
execution to a point where they cannot report exceptions. However, the instruction
addressed by SRR0 and those following it may have started execution (for example, they may
have been fetched, dispatched, or decoded) but have not completed execution.

When an exception occurs, instruction dispatch (the issuance of instructions by the 
instruction fetch unit to any instruction execution mechanism) is halted and the following 
synchronization is performed: 

1. The exception mechanism waits for all previous instructions in the instruction
stream to complete to a point where they will not report any exceptions. 

2. The processor ensures that all previous instructions in the instruction stream 
complete in the context in which they began execution. 

3. The exception mechanism implemented in hardware (the loading of registers SRR0
and SRR1) and the software handler (saving SRR0 and SRR1 on the stack,
updating the stack pointer, and so on) are responsible for saving and restoring the
processor state.

The synchronization described conforms to the requirements for context synchronization,
which is described fully in the following section.

6.1.2 Synchronization 

The synchronization described in this section refers to the state of activities within the 
processor that performs the synchronization. 

6.1.2.1 Context Synchronization

An instruction or event is context synchronizing if it satisfies all the requirements listed 
below. Such instructions and events are collectively called context-synchronizing 
operations. Examples of context-synchronizing operations include the sc and
rfi instructions and most exceptions. A context-synchronizing operation has the following
characteristics: 

1. The operation causes instruction fetching and dispatching (the issuance of
instructions by the instruction fetch mechanism to any instruction execution 
mechanism) to be halted. 

2. The operation is not initiated or, in the case of isync, does not complete, until all 
instructions in execution have completed to a point at which they have reported all 
exceptions they will cause. 

If a prior memory access instruction causes one or more direct-store interface error 
exceptions, the results are guaranteed to be determined before this instruction is 
executed. However, note that the direct-store facility is being phased out of the 
architecture and will not likely be supported in future devices.

3. Instructions that precede the operation complete execution in the context (for
example, the privilege, translation mode, and memory protection) in which they 
were initiated. 

4. If the operation either directly causes an exception (for example, the sc instruction 
causes a system call exception) or is an exception, the operation is not initiated until 
no exception exists having higher priority than the exception associated with the 
context-synchronizing operation.

A context-synchronizing operation is necessarily execution synchronizing. Unlike the sync
instruction, a context-synchronizing operation need not wait for memory-related operations 
to complete on this or other processors, or for referenced and changed bits in the page table 
to be updated. 

6.1.2.2 Execution Synchronization

An instruction is execution synchronizing if it satisfies the conditions of the first two items 
described above for context synchronization. The sync instruction is treated like isync with 
respect to the second item described above (that is, the conditions described in the second 
item apply to the completion of sync). The sync and mtmsr instructions are examples of
execution-synchronizing instructions.

All context-synchronizing instructions are execution-synchronizing. Unlike a context- 
synchronizing operation, an execution-synchronizing instruction need not ensure that the 
subsequent instructions execute in the context established by this and previous instructions. 
This new context becomes effective sometime after the execution-synchronizing 
instruction completes and before or at a subsequent context-synchronizing operation. 

6.1.2.3 Synchronous/Precise Exceptions

When instruction execution causes a precise exception, the following conditions exist at the 
exception point: 

• SRR0 always points to the instruction causing the exception, except for the sc
instruction, in which case SRR0 points to the immediately following instruction. The
instruction addressed can be determined from the exception type and status bits,
which are defined in the description of each exception. In all cases SRR0 points to
the first instruction that has not completed execution. The sc instruction always
completes execution, updates the instruction pointer, and reports the exception.
Hence, SRR0 points to the instruction following sc.

• All instructions that precede the excepting instruction complete to a point where 
they will not report exceptions before the exception is processed. However, some 
memory accesses generated by these preceding instructions may not have been 
performed with respect to all other processors or system devices. 

• The instruction causing the exception may not have begun execution, may have 
partially completed, or may have completed, depending on the exception type. 
Handling of partially executed instructions is described in Section 6.1.4, “Partially 
Executed Instructions.” 

• Architecturally, no subsequent instruction has completed execution.

While instruction parallelism allows the possibility of multiple instructions reporting
exceptions during the same cycle, they are handled one at a time in program order. 
Exception priorities are described in Section 6.1.5, “Exception Priorities.” 

6.1.2.4 Asynchronous Exceptions

There are four asynchronous exceptions: system reset and machine check, which are
nonmaskable and highest-priority exceptions, and external interrupt and decrementer
exceptions, which are maskable and low-priority. These two types of asynchronous
exceptions are discussed separately.

6.1.2.4.1 System Reset and Machine Check Exceptions

System reset and machine check exceptions have the highest priority and can occur while 
other exceptions are being processed. 

NOTE: Nonmaskable, asynchronous exceptions are never delayed; therefore, if two of 
these exceptions occur in immediate succession, the state information saved by 
the first exception may be overwritten when the subsequent exception occurs. 
Also, these exceptions are context-synchronizing if they are recoverable; the 
system uses the MSR[RI] to detect whether an exception is recoverable. 

While a system is running normally, the MSR[RI] bit is set. When an exception occurs, a
copy of the MSR is stored in SRR1, and most MSR bits, including RI, are then cleared. The
new MSR settings vary by exception type (see the description of each exception; for
example, IP is never cleared). The exception handler saves the state of the machine (saving
SRR0 and SRR1 on the stack and updating the stack pointer) to the point where it can incur
another exception. At this point the exception handler sets MSR[RI], and external
interrupts can be re-enabled. Consequently, if the exception handler ever finds that the
MSR[RI] bit saved in SRR1 is not set, the exception is not recoverable (because it occurred
while the machine state was being saved), and a system restart procedure should be
initiated.
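
The recoverability test described above can be sketched in C. This is only an
illustration under stated assumptions: the names SRR1_RI and exception_is_recoverable
are hypothetical, the SRR1 value is passed in as a plain integer rather than read with
mfspr, and the mask assumes the 32-bit MSR layout in which RI is bit 30 (with bit 0 as
the most-significant bit).

```c
#include <stdbool.h>
#include <stdint.h>

/* RI is MSR bit 30 in PowerPC numbering (bit 0 = most-significant bit),
   so in a 32-bit word it is the second-lowest bit. Hypothetical name. */
#define SRR1_RI (UINT32_C(1) << (31 - 30))

/* Returns true if the interrupted context had MSR[RI] = 1, that is, the
   exception did not arrive while SRR0/SRR1 were still being saved. */
static bool exception_is_recoverable(uint32_t srr1)
{
    return (srr1 & SRR1_RI) != 0;
}
```

A handler for a system reset or machine check exception would typically make this check
first and start the system restart procedure when it returns false.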

System reset and machine check exceptions cannot be masked by using the MSR[EE] bit.
Furthermore, if the machine check enable bit, MSR[ME], is cleared and a machine check
exception condition occurs, the processor goes directly into checkstop state as the result of
the exception condition. For this reason, a system should not run with MSR[ME] cleared
for extended periods of time. When one of these exceptions occurs, the following conditions
exist at the exception point:

• For system reset exceptions, SRR0 addresses the instruction that would have
attempted to execute next if the exception had not occurred.

• For machine check exceptions, SRR0 addresses either an instruction that would have
completed or some instruction following it that would have completed if the
exception had not occurred.

• An exception is generated such that all instructions preceding the instruction
addressed by SRR0 appear to have completed with respect to the executing
processor.

6.1.2.4.2 External Interrupt and Decrementer Exceptions

For the external interrupt and decrementer exceptions, the following conditions exist at the 
exception point (assuming these exceptions are enabled (MSR[EE] bit is set)): 

• All instructions issued before the exception is taken and any instructions that 
precede those instructions in the instruction stream appear to have completed before 
the exception is processed. 

• No subsequent instructions in the instruction stream have completed execution. 

• SRR0 addresses the first instruction that has not completed execution.

That is, these exceptions are context-synchronizing. The external interrupt and decrementer 
exceptions are maskable. When the machine state register external interrupt enable bit is 
cleared (MSR[EE] = 0), these exception conditions are not recognized until the EE bit is 
set. MSR[EE] is cleared automatically when an exception is taken, to delay recognition of 
subsequent exception conditions. No two precise exceptions can be recognized 
simultaneously. Exception handling does not begin until all currently executing instructions 
complete and any synchronous, precise exceptions caused by those instructions have been 
handled. Exception priorities are described in Section 6.1.5, “Exception Priorities.” 

6.1.3 Imprecise Exceptions 

The PowerPC architecture defines several imprecise exceptions. An imprecise exception is
one where the instruction addressed by SRR0 has nothing to do with the exception taking
place. That is, some previously executed instruction created a condition that is now
causing an exception to take place. External interrupt and decrementer exceptions fit this
description. A third kind of imprecise exception is the imprecise floating-point enabled
exception: floating-point enabled exceptions can be programmed to be taken imprecisely.

6.1.3.1 Imprecise Exception Status Description

When the execution of an instruction causes an imprecise exception, SRR0 contains
information related to the address of the excepting instruction as follows:

• SRR0 contains the address of an instruction that has nothing to do with the exception
currently taking place.

• The instruction addressed by SRR0 and all subsequent instructions have not
completed execution.

• The exception is generated such that all instructions preceding the instruction
addressed by SRR0 have completed with respect to the processor.

6.1.3.2 Recoverability of Imprecise Floating-Point Exceptions

The enabled IEEE floating-point exception mode bits in the MSR (FE0 and FE1) together
define whether IEEE floating-point exceptions are handled precisely, imprecisely, or 
whether they are taken at all. The possible settings are shown in Table 6-3. For further 
details, see Section 3.3.6, “Floating-Point Program Exceptions.” 

Table 6-3. IEEE Floating-Point Program Exception Mode Bits 


FE0   FE1   Mode

0     0     Floating-point exceptions ignored
0     1     Floating-point imprecise nonrecoverable
1     0     Floating-point imprecise recoverable
1     1     Floating-point precise mode


As shown in the table, the imprecise floating-point enabled exception has two
modes: nonrecoverable and recoverable. These modes are specified by setting the
MSR[FE0] and MSR[FE1] bits and are described as follows:

• Imprecise nonrecoverable floating-point enabled mode. MSR[FE0] = 0;
MSR[FE1] = 1. When an exception occurs, the exception handler is invoked at some
point at or beyond the instruction that caused the exception. It may not be possible 
to identify the offending instruction or the data that caused the exception. Results 
from the offending instruction may have been used by or affected data of subsequent 
instructions executed before the exception handler was invoked. 

• Imprecise recoverable floating-point enabled mode. MSR[FE0] = 1; MSR[FE1] = 0.
When an exception occurs, the floating-point enabled exception handler is invoked 
at some point at or beyond the offending instruction that caused the exception. 
Sufficient information is provided to the exception handler that it can identify the 
offending instruction and correct any faulty data. In this mode, no incorrect data 
caused by the offending instruction have been used by or affected data of subsequent 
instructions that are executed before the exception handler is invoked. 
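
The mapping in Table 6-3 can be sketched as a small C decoder. This is an illustration,
not part of the architecture definition: the enum and function names are hypothetical,
and the masks assume the 32-bit MSR layout in which FE0 is bit 20 and FE1 is bit 23
(with bit 0 as the most-significant bit).

```c
#include <stdint.h>

/* MSR bit n (bit 0 = most-significant bit) maps to mask 1 << (31 - n). */
#define MSR_FE0 (UINT32_C(1) << (31 - 20))
#define MSR_FE1 (UINT32_C(1) << (31 - 23))

/* Hypothetical names for the four modes listed in Table 6-3. */
typedef enum {
    FP_EXC_IGNORED,
    FP_EXC_IMPRECISE_NONRECOVERABLE,
    FP_EXC_IMPRECISE_RECOVERABLE,
    FP_EXC_PRECISE
} fp_exc_mode;

/* Decode the floating-point exception mode from an MSR value. */
static fp_exc_mode fp_exception_mode(uint32_t msr)
{
    int fe0 = (msr & MSR_FE0) != 0;
    int fe1 = (msr & MSR_FE1) != 0;

    if (fe0 && fe1) return FP_EXC_PRECISE;
    if (fe0)        return FP_EXC_IMPRECISE_RECOVERABLE;
    if (fe1)        return FP_EXC_IMPRECISE_NONRECOVERABLE;
    return FP_EXC_IGNORED;
}
```

The decoder reports the mode requested by the MSR bits; how a given processor actually
handles the two imprecise modes is implementation-dependent.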

Although these exceptions are maskable with these bits, they differ from other maskable 
exceptions in that the masking is usually controlled by the application program rather than 
by the operating system. 

(As of the date of this publication, no PowerPC processor has implemented these two modes
of floating-point exceptions; processors treat both of them as floating-point precise mode.)

6.1.4 Partially Executed Instructions

The architecture permits certain instructions to be partially executed when an alignment 
exception or DSI exception occurs, or an imprecise floating-point exception is forced by an 
instruction that causes an alignment or DSI exception. They are as follows: 

• Load multiple/string instructions that cause an alignment or DSI exception — Some 
registers in the range of registers to be loaded may have been loaded. 

• Store multiple/string instructions that cause an alignment or DSI exception — Some 
bytes in the addressed memory range may have been updated. 

• Non-multiple/string store instructions that cause an alignment or DSI 
exception — Some bytes just before the boundary may have been updated. If the 
instruction normally alters CR0 (stwcx.), CR0 is set to an undefined value. For
instructions that perform register updates, the update register (rA) is not altered. 

• Floating-point load instructions that cause an alignment or DSI exception — The 
target register may be altered. For update forms, the update register (rA) is not 
altered. 

• A load or store to a direct-store segment that causes a DSI exception due to a direct- 
store interface error exception — Some of the associated address/data transfers may 
not have been initiated. All initiated transfers are completed before the exception is 
reported, and the transfers that have not been initiated are aborted. Thus the 
instruction completes before the DSI exception occurs. However, note that the 
direct-store facility is being phased out of the architecture and will not likely be 
supported in future devices. 

In the cases above, the number of registers and the amount of memory altered are 
implementation-, instruction-, and boundary-dependent. However, memory protection is 
not violated. Furthermore, if some of the data accessed are in a direct-store segment and the 
instruction is not supported for use in such memory space, the locations in the direct-store 
segment are not accessed. Again, note that the direct-store facility is being phased out of 
the architecture and will not likely be supported in future devices. 

Partial execution is not allowed when integer load operations (except multiple/string 
operations) cause an alignment or DSI exception. The target register is not altered. For 
update forms of the integer load instructions, the update register (rA) is not altered.

6.1.5 Exception Priorities

Exceptions are roughly prioritized by exception class, as follows: 

1. Nonmaskable, asynchronous exceptions have priority over all other
exceptions — system reset and machine check exceptions (although the machine 
check exception condition can be disabled so that the condition causes the processor 
to go directly into the checkstop state). These two types of exceptions in this class 
cannot be delayed by exceptions in other classes, and do not wait for the completion 
of any other exception handling. 

2. Synchronous, precise exceptions are caused by instructions and are taken in strict 
program order. 

3. Maskable asynchronous exceptions (external interrupt and decrementer exceptions)
have lowest priority. 

The exceptions are listed in Table 6-4 in order of highest to lowest priority. 

Table 6-4. Exception Priorities 


Exception class: Nonmaskable, asynchronous

Priority 1. System reset: The system reset exception has the highest priority of all
exceptions. If this exception exists, the exception mechanism ignores all other exceptions
and generates a system reset exception. When the system reset exception is generated,
previously issued instructions can no longer generate exception conditions that cause a
nonmaskable exception.

Priority 2. Machine check: The machine check exception is the second-highest priority
exception. If this exception occurs, the exception mechanism ignores all other exceptions
(except reset) and generates a machine check exception. When the machine check exception
is generated, previously issued instructions can no longer generate exception conditions
that cause a nonmaskable exception.

Exception class: Synchronous, precise

Priority 3. Instruction dependent: When an instruction causes an exception, the exception
mechanism waits for any instructions prior to the offending instruction in the instruction
stream to complete. Any exceptions caused by these instructions are handled first. It then
generates the appropriate exception if no higher priority exception exists.

Note: A single instruction can cause multiple exceptions. When this occurs, those
exceptions are ordered in priority as indicated in the following:

A. Integer loads and stores
   a. Alignment
   b. DSI
   c. Trace (if implemented)

B. Floating-point loads and stores
   a. Floating-point unavailable
   b. Alignment
   c. DSI
   d. Trace (if implemented)

C. Other floating-point instructions
   a. Floating-point unavailable
   b. Program: Precise-mode floating-point enabled exception
   c. Floating-point assist (if implemented)
   d. Trace (if implemented)

D. rfi and mtmsr
   a. Program: Privileged Instruction
   b. Program: Precise-mode floating-point enabled exception
   c. Trace (if implemented), for mtmsr only

   If precise-mode IEEE floating-point enabled exceptions are enabled and the
   FPSCR[FEX] bit is set, a program exception occurs no later than the next
   synchronizing event.

E. Other instructions
   a. These exceptions are mutually exclusive and have the same priority:
      - Program: Trap
      - System call (sc)
      - Program: Privileged Instruction
      - Program: Illegal Instruction
   b. Trace (if implemented)

F. ISI exception

   The ISI exception has the lowest priority in this category. It is only recognized when
   all instructions prior to the instruction causing this exception appear to have completed
   and that instruction is to be executed. The priority of this exception is specified for
   completeness and to ensure that it is not given more favorable treatment. An
   implementation can treat this exception as though it had a lower priority.

Exception class: Imprecise

Priority 4. Program imprecise floating-point mode enabled exceptions: When this exception
occurs, the exception handler is invoked at or beyond the floating-point instruction that
caused the exception. The PowerPC architecture supports recoverable and nonrecoverable
imprecise modes, which are enabled by setting MSR[FE0] ≠ MSR[FE1]. For more information,
see Section 6.1.3, “Imprecise Exceptions.”

Exception class: Maskable, imprecise, asynchronous

Priority 5. External interrupt: The external interrupt mechanism waits for instructions
currently or previously dispatched to complete execution. After all such instructions are
completed, and any exceptions caused by those instructions have been handled, the exception
mechanism generates this exception if no higher priority exception exists. This exception
is enabled only if MSR[EE] is currently set. If EE is zero when the exception is detected,
it is delayed until the bit is set.

Priority 6. Decrementer: This exception is the lowest priority exception. When this
exception is created, the exception mechanism waits for all other possible exceptions to be
reported. It then generates this exception if no higher priority exception exists. This
exception is enabled only if MSR[EE] is currently set. If EE is zero when the exception is
detected, it is delayed until the bit is set.


Nonmaskable, asynchronous exceptions (namely, system reset or machine check 
exceptions) may occur at any time. That is, these exceptions are not delayed if another 
exception is being handled (although machine check exceptions can be delayed by system 
reset exceptions). As a result, state information for the interrupted exception handler may 
be lost. 

All other exceptions have lower priority than system reset and machine check exceptions, 
and the exception may not be taken immediately when it is recognized. Only one 
synchronous, precise exception can be reported at a time. If a maskable, asynchronous or 
an imprecise exception condition occurs while instruction-caused exceptions are being 
processed, its handling is delayed until all exceptions caused by previous instructions in the 
program flow are handled and those instructions complete execution. 
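
The priority ordering in Table 6-4 can be illustrated with a small C model that picks the
next exception to take from a set of pending conditions. This is a simplified sketch, not
a description of any processor's logic: the names are hypothetical, a pending flag is
assumed to mean the condition is already enabled (except for the MSR[EE] masking of
priorities 5 and 6, which is modeled), and the checkstop case for MSR[ME] = 0 is not
modeled.

```c
#include <stdint.h>

#define MSR_EE (UINT32_C(1) << (31 - 16))   /* external interrupt enable */

/* Hypothetical names; the values follow the priorities in Table 6-4. */
typedef enum {
    EXC_NONE = 0,
    EXC_SYSTEM_RESET = 1,
    EXC_MACHINE_CHECK = 2,
    EXC_INSTRUCTION_CAUSED = 3,
    EXC_IMPRECISE_FP = 4,
    EXC_EXTERNAL = 5,
    EXC_DECREMENTER = 6
} exc_priority;

/* pending: bit p set means an exception of priority p is pending.
   Returns the highest-priority exception that can be taken now. */
static exc_priority next_exception(unsigned pending, uint32_t msr)
{
    for (int p = EXC_SYSTEM_RESET; p <= EXC_DECREMENTER; p++) {
        if (!(pending & (1u << p)))
            continue;
        /* Maskable exceptions stay pending while MSR[EE] = 0. */
        if (p >= EXC_EXTERNAL && !(msr & MSR_EE))
            continue;
        return (exc_priority)p;
    }
    return EXC_NONE;
}
```

For example, with both an external interrupt and a decrementer condition pending, the
model returns EXC_EXTERNAL when MSR[EE] is set and EXC_NONE when it is cleared.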

6.2 Exception Processing 

When an exception is taken, the processor uses the save/restore registers, SRR1 and SRR0,
respectively, to save the contents of the MSR for the interrupted process and to help 
determine where instruction execution should resume after the exception is handled. 

When an exception occurs, the address saved in SRR0 is used to help calculate where
instruction processing should resume when the exception handler returns control to the
interrupted process. Depending on the exception, SRR0 holds the address of the instruction
that caused the exception or the address of the next instruction in the program flow (as in
the case of a system call or trap exception). All instructions in the program flow preceding
the addressed instruction will have completed execution, and no subsequent instruction will
have completed execution. The SRR0 register is shown in Figure 6-1.

Bits 0-29: SRR0 (holds EA for instruction in interrupted program flow)
Bits 30-31: Reserved (00)

Figure 6-1. Machine Status Save/Restore Register 0

The save/restore register 1 (SRR1) is used to save machine status (selected bits from the
MSR and other implementation-specific status bits as well) on exceptions and to restore
those values when rfi is executed. SRR1 is shown in Figure 6-2.

Bits 0-31: Exception-specific information and MSR bit values


Figure 6-2. Machine Status Save/Restore Register 1 

When an exception occurs, SRR1[1-4] and SRR1[10-15] are loaded with exception-specific
information, and MSR bits 16-23, 25-27, and 30-31 are placed into the corresponding bit
positions of SRR1. Depending on the implementation, additional bits of the MSR may be
copied to SRR1.
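
The way SRR1 is assembled can be sketched in C as follows. The helper names msr_bits and
srr1_from_msr are hypothetical, bit ranges use PowerPC numbering (bit 0 is the
most-significant bit), and this sketch leaves all other SRR1 bits at zero, which is an
assumption; an implementation may copy additional MSR bits.

```c
#include <stdint.h>

/* Mask covering bits first..last in PowerPC numbering (bit 0 = MSB). */
static uint32_t msr_bits(unsigned first, unsigned last)
{
    uint32_t high = (uint32_t)UINT32_MAX >> first;        /* bits first..31 */
    uint32_t low  = (uint32_t)UINT32_MAX << (31 - last);  /* bits 0..last   */
    return high & low;
}

/* Build the SRR1 image: MSR bits 16-23, 25-27, and 30-31 are copied, and
   bits 1-4 and 10-15 carry exception-specific information. */
static uint32_t srr1_from_msr(uint32_t msr, uint32_t exception_info)
{
    uint32_t msr_mask  = msr_bits(16, 23) | msr_bits(25, 27) | msr_bits(30, 31);
    uint32_t info_mask = msr_bits(1, 4) | msr_bits(10, 15);

    return (msr & msr_mask) | (exception_info & info_mask);
}
```

With these ranges, the MSR mask works out to 0x0000FF73 and the exception-information
mask to 0x783F0000.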

NOTE: In some implementations, every instruction fetch when MSR[IR] = 1, and every
data access requiring address translation when MSR[DR] = 1, may modify SRR0
and SRR1.


The MSR is 32 bits wide as shown in Figure 6-3.

Figure 6-3. Machine State Register (MSR)

Table 6-5 shows the bit definitions for the MSR. 


Table 6-5. MSR Bit Settings 


Bit(s) 

Name 

Description 

0-12 

— 

Reserved 

13 

POW 

Power management enable 

0 Power management disabled (normal operation mode). 

1 Power management enabled (reduced power mode). 

Note: Power management functions are implementation-dependent. If the function is not 
implemented, this bit is treated as reserved. 

14 

— 

Reserved 

15 

ILE 

Exception little-endian mode. When an exception occurs, this bit is copied into MSR[LE] to select 
the endian mode for the context established by the exception. 

16 

EE 

External interrupt enable 

0 While the bit is cleared the processor delays recognition of external interrupts and decrementer 
exception conditions. 

1 The processor is enabled to take an external interrupt or the decrementer exception. 

17 

PR 

Privilege level 

0 The processor can execute both user- and supervisor-level instructions. 

1 The processor can only execute user-level instructions. 

18 

FP 

Floating-point available 

0 The processor prevents dispatch of floating-point instructions, including floating-point loads, 
stores, and moves. 

1 The processor can execute floating-point instructions. 

19 

ME

Machine check enable 

0 Machine check exceptions are disabled. 

1 Machine check exceptions are enabled. 

20 

FE0

Floating-point exception mode 0 (see Table 2-9). 

21 

SE 

Single-step trace enable (Optional) 

0 The processor executes instructions normally. 

1 The processor generates a single-step trace exception upon the successful execution of the 
next instruction. 

Note: If the function is not implemented, this bit is treated as reserved. 

22 

BE 

Branch trace enable (Optional) 

0 The processor executes branch instructions normally. 

1 The processor generates a branch trace exception after completing the execution of a branch 
instruction, regardless of whether or not the branch was taken. 

Note: If the function is not implemented, this bit is treated as reserved. 

23 

FE1 

Floating-point exception mode 1 (See Table 2-9). 

24 

— 

Reserved 

25 

IP 

Exception prefix. The setting of this bit specifies whether an exception vector offset is prepended
with Fs or 0s. In the following description, nnnnn is the offset of the exception vector. See Table 6-2.

0 Exceptions are vectored to the physical address 0x000n_nnnn.

1 Exceptions are vectored to the physical address 0xFFFn_nnnn.

In most systems, IP is set to 1 during system initialization, and then cleared to 0 when initialization is
complete.

26 

IR 

Instruction address translation 

0 Instruction address translation is disabled. 

1 Instruction address translation is enabled. 

For more information see Chapter 7, “Memory Management.” 

27 

DR 

Data address translation 

0 Data address translation is disabled. 

1 Data address translation is enabled. 

For more information see Chapter 7, “Memory Management.” 

28-29 

— 

Reserved 
















































30 

RI

Recoverable exception (for system reset and machine check exceptions). 

0 Exception is not recoverable. 

1 Exception is recoverable. 

For more information see Section 6.4.1, “System Reset Exception (0x00100),” and Section 6.4.2,
“Machine Check Exception (0x00200).”

31 

LE 

Little-endian mode enable 

0 The processor runs in big-endian mode. 

1 The processor runs in little-endian mode. 
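
The bit assignments in Table 6-5 can be written as C masks. The macro names below are
hypothetical (they are not defined by the architecture); the only assumption is the
PowerPC convention that bit 0 is the most-significant bit of the 32-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/* PowerPC numbers bits from the most-significant end, so MSR bit n of a
   32-bit register corresponds to the C mask 1 << (31 - n). */
#define MSR_BIT(n) (UINT32_C(1) << (31 - (n)))

#define MSR_POW MSR_BIT(13)  /* power management enable      */
#define MSR_ILE MSR_BIT(15)  /* exception little-endian mode */
#define MSR_EE  MSR_BIT(16)  /* external interrupt enable    */
#define MSR_PR  MSR_BIT(17)  /* privilege level              */
#define MSR_FP  MSR_BIT(18)  /* floating-point available     */
#define MSR_ME  MSR_BIT(19)  /* machine check enable         */
#define MSR_FE0 MSR_BIT(20)  /* floating-point exc. mode 0   */
#define MSR_SE  MSR_BIT(21)  /* single-step trace enable     */
#define MSR_BE  MSR_BIT(22)  /* branch trace enable          */
#define MSR_FE1 MSR_BIT(23)  /* floating-point exc. mode 1   */
#define MSR_IP  MSR_BIT(25)  /* exception prefix             */
#define MSR_IR  MSR_BIT(26)  /* instruction addr translation */
#define MSR_DR  MSR_BIT(27)  /* data address translation     */
#define MSR_RI  MSR_BIT(30)  /* recoverable exception        */
#define MSR_LE  MSR_BIT(31)  /* little-endian mode enable    */

/* True if every bit in mask is set in the given MSR value. */
static bool msr_is_set(uint32_t msr, uint32_t mask)
{
    return (msr & mask) == mask;
}
```

For instance, msr_is_set(msr, MSR_IR | MSR_DR) tests whether both instruction and data
address translation are enabled.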


When an exception occurs, instruction fetching, dispatching, and decoding stop.
The processor waits until all previous instructions have completed to a point where no other
exceptions will be reported. SRR0 is loaded with the address where program execution will
resume when the exception has been processed. SRR1 is loaded with bits from the MSR
along with any status bits for this exception. A new value is loaded into the MSR, and
instruction execution resumes at the entry point for the exception handler under the
influence of the new MSR.

The data address register (DAR) may be used by several exceptions (for example, DSI and 
alignment exceptions) to identify the address of a memory element. 

6.2.1 Enabling and Disabling Exceptions 

When a condition exists that may cause an exception to be generated, it must be determined 
whether the exception is enabled for that condition as follows: 

• IEEE floating-point enabled exceptions (a type of program exception) are ignored 
when both MSR[FE0] and MSR[FE1] are cleared. If either of these bits is set, all 
IEEE enabled floating-point exceptions are taken and cause a program exception. 

• Asynchronous, maskable exceptions (that is, the external and decrementer 
interrupts) are enabled by setting the MSR[EE] bit. When MSR[EE] = 0, recognition 
of these exception conditions is delayed. MSR[EE] is cleared automatically when an 
exception is taken, to delay recognition of conditions causing those exceptions. 

• A machine check exception can only occur if the machine check enable bit, 
MSR[ME], is set. If MSR[ME] is cleared, the processor goes directly into checkstop 
state when a machine check exception condition occurs. 
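
The three enabling rules above can be expressed as simple predicates on an MSR value.
This is a minimal sketch with hypothetical names; the masks assume the bit positions
given in Table 6-5 (EE = 16, ME = 19, FE0 = 20, FE1 = 23, with bit 0 as the
most-significant bit).

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_EE  (UINT32_C(1) << (31 - 16))
#define MSR_ME  (UINT32_C(1) << (31 - 19))
#define MSR_FE0 (UINT32_C(1) << (31 - 20))
#define MSR_FE1 (UINT32_C(1) << (31 - 23))

/* IEEE floating-point enabled exceptions are ignored only when both
   FE0 and FE1 are cleared. */
static bool fp_enabled_exceptions_taken(uint32_t msr)
{
    return (msr & (MSR_FE0 | MSR_FE1)) != 0;
}

/* External and decrementer interrupts are recognized only when EE = 1. */
static bool maskable_exceptions_enabled(uint32_t msr)
{
    return (msr & MSR_EE) != 0;
}

/* Hypothetical outcome type for a machine check condition. */
typedef enum { MC_TAKE_EXCEPTION, MC_CHECKSTOP } mc_outcome;

/* With ME = 0 a machine check condition leads directly to checkstop. */
static mc_outcome machine_check_outcome(uint32_t msr)
{
    return (msr & MSR_ME) ? MC_TAKE_EXCEPTION : MC_CHECKSTOP;
}
```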
6.2.2 Steps for Exception Processing

After it is determined that the exception can be taken (by confirming that any instruction- 
caused exceptions occurring earlier in the instruction stream have been handled, and by 
confirming that the exception is enabled for the exception condition), the processor does 
the following: 

1. The machine status save/restore register 0 (SRR0) is loaded with an instruction
address that depends on the type of exception. See the individual exception
description for details about how this register is used for specific exceptions.
Normally, SRR0 contains the address of the first instruction to execute if the
exception handler resumes program execution.

2. SRR1 1-4 and 10-15 are loaded with information specific to the exception type. 

3. MSR 16-23, 25-27, and 30-31 are loaded with a copy of the corresponding bits of 
the MSR. 

NOTE: Depending on the implementation, additional bits from the MSR may be 
saved in SRR1. 

4. The MSR is set as described in Table 6-5. The new values take effect beginning with 
the fetching of the first instruction of the exception-handler routine located at the 
exception vector address. 

NOTE: MSR[IR] and MSR[DR] are cleared for all exception types; therefore, 
address translation is disabled for both instruction fetches and data accesses 
beginning with the first instruction of the exception-handler routine. 

Also, the MSR[ILE] bit setting at the time of the exception is copied to 
MSR[LE] when the exception is taken (as shown in Table 6-5). 

5. The MSR[RI] bit is cleared. This indicates that the interrupt handler is operating in 
the “window of vulnerability” and cannot recover if another exception now occurs. 
After the machine state (SRR0 and SRR1) is saved and the stack pointer has been 
updated, the exception handler sets this bit to indicate that it can now handle 
another exception. See Section 6.1.2.4.1, “System Reset and Machine Check 
Exceptions,” for more details. 

6. Instruction fetch and execution resumes, using the new MSR value, at a location 
specific to the exception type. The location is determined by adding the exception's 
vector offset (see Table 6-2) to the base address determined by MSR[IP]. If IP is 
cleared, exceptions are vectored to the physical address 0x000n_nnnn. If IP is set, 
exceptions are vectored to the physical address 0xFFFn_nnnn. For a machine check 
exception that occurs when MSR[ME] = 0 (machine check exceptions are disabled), 
the checkstop state is entered (the machine stops executing instructions). See 
Section 6.4.2, “Machine Check Exception (0x00200).” 

In some implementations, any instruction fetch with MSR[IR] = 1 and any load or store 
with MSR[DR] = 1 may cause SRR0 and SRR1 to be modified. 
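
The numbered steps can be followed in a small software model. The sketch below is not an implementation of any processor: `cpu_state` and `take_exception` are hypothetical names, the MSR masks assume this manual's bit numbering (bit 0 is the most-significant bit), and the new MSR value follows the common pattern in which only ILE, ME, and IP are preserved. A machine check additionally clears ME, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_BIT(n) (UINT32_C(1) << (31 - (n)))
#define MSR_ILE MSR_BIT(15)
#define MSR_EE  MSR_BIT(16)
#define MSR_ME  MSR_BIT(19)
#define MSR_IP  MSR_BIT(25)
#define MSR_IR  MSR_BIT(26)
#define MSR_DR  MSR_BIT(27)
#define MSR_RI  MSR_BIT(30)
#define MSR_LE  MSR_BIT(31)

/* SRR1 bits 16-23, 25-27, and 30-31 receive a copy of the MSR. */
#define SRR1_MSR_MASK  UINT32_C(0x0000FF73)
/* SRR1 bits 1-4 and 10-15 carry exception-specific information. */
#define SRR1_INFO_MASK UINT32_C(0x783F0000)

/* Hypothetical model state, for illustration only. */
typedef struct { uint32_t msr, srr0, srr1, pc; } cpu_state;

void take_exception(cpu_state *s, uint32_t vector_offset,
                    uint32_t resume_addr, uint32_t srr1_info)
{
    assert((srr1_info & ~SRR1_INFO_MASK) == 0);

    s->srr0 = resume_addr;                           /* step 1      */
    s->srr1 = srr1_info | (s->msr & SRR1_MSR_MASK);  /* steps 2, 3  */

    /* Steps 4 and 5: keep ILE, ME, and IP; clear everything else,
     * including IR, DR, and RI; copy ILE into LE. */
    uint32_t new_msr = s->msr & (MSR_ILE | MSR_ME | MSR_IP);
    if (s->msr & MSR_ILE)
        new_msr |= MSR_LE;
    s->msr = new_msr;

    /* Step 6: vector base selected by MSR[IP], plus the vector offset. */
    uint32_t base = (new_msr & MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0);
    s->pc = base + vector_offset;
}
```

With MSR[IP] = 1, an external interrupt (offset 0x00500) taken through this model resumes at 0xFFF00500 with translation disabled.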

6.2.3 Returning from an Exception Handler 

The Return from Interrupt (rfi) instruction performs context synchronization by allowing 
previously issued instructions to complete before returning to the interrupted process. 

Execution of the instruction ensures the following: 

• All previous instructions have completed to a point where they can no longer cause 
an exception. 

If a previous instruction causes a direct-store interface error exception, the results 
are determined before this instruction is executed. However, note that the direct- 
store facility is being phased out of the architecture and will not likely be supported 
in future devices. 

• Previous instructions complete execution in the context (privilege, protection, and 
address translation) under which they were issued. 

• The instruction copies SRR1 bits back into the MSR. 

• The processor branches to the instruction addressed by SRR0 and begins program 
execution under control of the MSR bits loaded from the SRR1 register. 

For a complete description of context synchronization, refer to Section 6.1.2.1, “Context 
Synchronization.” 
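
The last two effects of rfi can be modeled as follows. This is a sketch under stated assumptions: `cpu_state` and `model_rfi` are hypothetical names, the set of MSR bits restored is assumed to be the same set that Section 6.2.2 saves in SRR1 (bits 16-23, 25-27, and 30-31), and the return address is assumed to be forced to a word boundary. The synchronization behavior itself is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSR bits assumed to be restored from SRR1: 16-23, 25-27, 30-31. */
#define SRR1_MSR_MASK UINT32_C(0x0000FF73)

/* Hypothetical model state, for illustration only. */
typedef struct { uint32_t msr, srr0, srr1, pc; } cpu_state;

void model_rfi(cpu_state *s)
{
    assert(s != NULL);
    /* Copy the saved bits back into the MSR, leaving other bits alone. */
    s->msr = (s->msr & ~SRR1_MSR_MASK) | (s->srr1 & SRR1_MSR_MASK);
    /* Branch to the instruction addressed by SRR0 (word-aligned). */
    s->pc = s->srr0 & ~UINT32_C(3);
}
```

Exception-specific SRR1 bits (for example bits 1-4) fall outside the mask, so they do not leak into the MSR when the handler returns.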

6.3 Process Switching 

The operating system should execute the following when processes are switched: 

• The sync instruction, which orders the effects of instruction execution. All 
instructions previously initiated appear to have completed before the sync 
instruction completes, and no subsequent instructions appear to be initiated until the 
sync instruction completes. 

• The isync instruction, which waits for all previous instructions to complete and then 
discards any fetched instructions, causing subsequent instructions to be fetched (or 
refetched) from memory and to execute in the context (privilege, translation, 
protection, etc.) established by the previous instructions. 

• The stwcx. instruction, to clear any outstanding reservations, which ensures that an 
lwarx instruction in the old process is not paired with an stwcx. instruction in the 
new process. This is necessary because some implementations of the PowerPC 
architecture do not do an address compare when the stwcx. is executed. Only the 
reservation is required for the stwcx. to be successful. 
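
The reason for the dummy stwcx. can be seen in a toy model of an implementation that tracks only a reservation flag and performs no address compare. All names below are hypothetical and the model ignores everything except the reservation itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy processor model: one reservation flag, no address compare. */
typedef struct { bool reserved; } cpu_model;

/* lwarx: load the word and establish a reservation. */
uint32_t model_lwarx(cpu_model *cpu, const uint32_t *addr)
{
    assert(cpu != NULL && addr != NULL);
    cpu->reserved = true;
    return *addr;
}

/* stwcx.: store only if a reservation exists; always clear it afterward. */
bool model_stwcx(cpu_model *cpu, uint32_t *addr, uint32_t value)
{
    assert(cpu != NULL && addr != NULL);
    bool ok = cpu->reserved;
    if (ok)
        *addr = value;
    cpu->reserved = false;
    return ok;
}

/* Process switch: a dummy stwcx. to a scratch word owned by the OS. */
void model_clear_reservation(cpu_model *cpu, uint32_t *scratch)
{
    (void)model_stwcx(cpu, scratch, 0);
}
```

In this model, a `model_lwarx` in the old process followed directly by a `model_stwcx` in the new process succeeds even though the addresses differ; calling `model_clear_reservation` between them makes the second operation fail, as intended.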

The operating system should handle MSR[RI] as follows: 

• In machine check and system reset exception handlers — If the SRR1 bit 
corresponding to MSR[RI] is cleared, the exception is not recoverable. 

• In each exception handler — When enough state information has been saved that a 
machine check or system reset exception can reconstruct the previous state, set 
MSR[RI]. 

• At the end of each exception handler — Clear MSR[RI], set the SRR0 and SRR1 
registers appropriately, update stack pointers, and then execute rfi. 

NOTE: The RI bit being set indicates that, with respect to the processor, enough 
processor state data is valid for the processor to continue, but it does not 
guarantee that the interrupted process can resume. 

6.4 Exception Definitions 

Table 6-6 shows all the types of exceptions that can occur and certain MSR bit settings when the 
exception handler is invoked. Depending on the exception, certain of these bits are stored 
in SRR1 when an exception is taken. The following subsections describe each exception in 
detail. 


Table 6-6. MSR Setting Due to Exception 


Exception Type                    POW  ILE  EE   PR   FP   ME   FE0  SE   BE   FE1  IP   IR   DR   RI   LE

System reset                      0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Machine check                     0    —    0    0    0    0    0    0    0    0    —    0    0    0    ILE
Data access                       0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Instruction access                0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
External                          0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Alignment                         0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Program                           0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Floating-point unavailable        0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Decrementer                       0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
System call                       0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Trace exception                   0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE
Floating-point assist exception   0    —    0    0    0    —    0    0    0    0    —    0    0    0    ILE

0      Bit is cleared 
1      Bit is set 
ILE    Bit is copied from the ILE bit in the MSR. 
—      Bit is not altered 

Reading of reserved bits may return 0, even if the value last written to them was 1. 

6.4.1 System Reset Exception (0x00100) 

The system reset exception is a nonmaskable, asynchronous exception signaled to the 
processor typically through the assertion of a system-defined signal; see Table 6-7. 


Table 6-7. System Reset Exception — Register Settings 


Register   Setting Description

SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no exception conditions were present.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30      Loaded from the equivalent MSR bit, MSR[RI], if the exception is
                   recoverable; otherwise cleared.
           31      Loaded with equivalent bit from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be
           copied to SRR1.

           If the processor state is corrupted to the extent that execution cannot
           resume reliably, the bit corresponding to MSR[RI] in SRR1 is cleared.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME   —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0


When a system reset exception is taken, instruction execution continues at offset 0x00100 
from the physical base address determined by MSR[IP]. 

If the exception is recoverable, the value of the MSR[RI] bit is copied to the corresponding 
SRR1 bit. The exception functions as a context-synchronizing operation. If a reset 
exception causes the loss of: 

• An external exception (interrupt or decrementer), 

• Direct-store error type DSI (the direct-store facility is being phased out of the 
architecture and is not likely to be supported in future devices), or 

• Floating-point enabled type program exception, 

then the exception is not recoverable. If the SRR1 bit corresponding to MSR[RI] is cleared, 
the exception is context-synchronizing only with respect to subsequent instructions. 

NOTE: Each implementation provides a means for software to distinguish between 
power-on reset and other types of system resets (such as soft reset). 

6.4.2 Machine Check Exception (0x00200) 

If no higher-priority exception is pending (namely, a system reset exception), the processor 
initiates a machine check exception when the appropriate condition is detected. 

NOTE: The causes of machine check exceptions are implementation- and system- 
dependent, and are typically signalled to the processor by the assertion of a 
specified signal on the processor interface. 

When a machine check condition occurs and MSR[ME] = 1, the exception is recognized 
and handled. If MSR[ME] = 0 and a machine check occurs, the processor generates an 
internal checkstop condition. When a processor is in checkstop state, instruction processing 
is suspended and generally cannot continue without resetting the processor. Some 
implementations may preserve some or all of the internal state of the processor when 
entering the checkstop state, so that the state can be analyzed as an aid in problem 
determination. 

In general, it is expected that a bus error signal would be used by a memory controller to 
indicate a memory parity error or an uncorrectable memory ECC error. 

NOTE: The resulting machine check exception has priority over any exceptions caused 
by the instruction that generated the bus operation. 

If a machine check exception causes an exception that is not context-synchronizing, the 
exception is not recoverable. Also, a machine check exception is not recoverable if it causes 
the loss of one of the following: 

• An external exception (interrupt or decrementer) 

• Direct-store error type DSI (the direct-store facility is being phased out of the 
architecture and is not likely to be supported in future devices) 

• Floating-point enabled type program exception 

If the SRR1 bit corresponding to MSR[RI] is cleared, the exception is context- 
synchronizing only with respect to subsequent instructions. If the exception is recoverable, 
the SRR1 bit corresponding to MSR[RI] is set and the exception is context-synchronizing. 

NOTE: If the error is caused by the memory subsystem, incorrect data could be loaded 
into the processor and register contents could be corrupted regardless of whether 
the exception is considered recoverable by the SRR1 bit corresponding to 
MSR[RI]. 

On some implementations, a machine check exception may be caused by referring to a 
nonexistent physical (real) address, either because translation is disabled (MSR[IR] or 
MSR[DR] = 0) or through an invalid translation. On such a system, execution of the dcbz 
or dcba instruction can cause a delayed machine check exception by introducing a block 
into the data cache that is associated with an invalid physical (real) address. A machine 
check exception could eventually occur when and if a subsequent attempt is made to store 
that block to memory (for example, as the block becomes the target for replacement, or as 
the result of executing a dcbst instruction). 

When a machine check exception is taken, registers are updated as shown in Table 6-8. 

Table 6-8. Machine Check Exception — Register Settings 


Register   Setting Description

SRR0       On a best-effort basis, implementations can set this to an EA of some
           instruction that was executing or about to be executing when the machine
           check condition occurred.

SRR1       Bit 30 is loaded from MSR[RI] if the processor is in a recoverable state;
           otherwise it is cleared. The setting of all other SRR1 bits is
           implementation-dependent.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME*  —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0

           * Note: When a machine check exception is taken, the exception handler
           should set MSR[ME] as soon as it is practical to handle another machine
           check exception. Otherwise, subsequent machine check exceptions cause the
           processor to automatically enter the checkstop state.


If MSR[RI] is set, the machine check exception may still be unrecoverable in the sense that 
execution cannot resume in the same context that existed before the exception. 

When a machine check exception is taken, instruction execution resumes at offset 0x00200 
from the physical base address determined by MSR[IP]. 


6.4.3 DSI Exception (0x00300) 

A DSI exception occurs when no higher priority exception exists and a data memory access 
cannot be performed. The condition that caused the DSI exception can be determined by 
reading the DSISR, a supervisor-level SPR (SPR18) that can be read by using the 
mfspr instruction. Bit settings are provided in Table 6-9. Table 6-9 also indicates which 
memory element is pointed to by the DAR. DSI exceptions can be generated by load/store 
instructions, cache-control instructions (icbi, dcbi, dcbz, dcbst, and dcbf), or the 
eciwx/ecowx instructions for any of the following reasons: 

• A load or a store instruction results in a direct-store error exception. 

NOTE: The direct-store facility is being phased out of the architecture and is not 
likely to be supported in future devices. 

• The effective address cannot be translated. That is, there is a page fault for this 
portion of the translation, so a DSI exception must be taken to retrieve the page and 
update the translation tables. For example, the handler may need to read the page from a 
storage device such as a hard disk drive. 

• The instruction is not supported for the type of memory addressed. 

— For lwarx/stwcx. instructions that reference a memory location that is 
write-through required. If the exception is not taken, the instructions execute correctly. 

— For lwarx/stwcx. or eciwx/ecowx instructions that attempt to access direct-store 
segments (the direct-store facility is being phased out of the architecture and is not 
likely to be supported in future devices). If the exception does not occur, the results 
are boundedly undefined. 

• The access violates memory protection. 

• The execution of an eciwx or ecowx instruction is disallowed because the external 
access register enable bit (EAR[E]) is cleared. 

• A data address breakpoint register (DABR) match occurs. The DABR facility is 
optional to the PowerPC architecture, but if one is implemented, it is recommended, 
but not required, that it be implemented as follows. A data address breakpoint match 
is detected for a load or store instruction if the three following conditions are met for 
any byte accessed: 

— EA[0-28] = DABR[DAB] 

— MSR[DR] = DABR[BT] 

— The instruction is a store and DABR[DW] = 1, or the instruction is a load and 
DABR[DR] = 1. 

The DABR is described in Section 2.3.15, “Data Address Breakpoint Register 
(DABR).” DAR settings are described in Table 6-9. If the above conditions are 
satisfied, it is undefined whether a match occurs in the following cases: 

— The instruction is store conditional but the store is not performed. 

— The instruction is a load/store string of zero length. 

— The instruction is dcbz, eciwx, or ecowx. 

The cache management instructions other than dcbz never cause a match. If dcbz 
causes a match, some or all of the target memory locations may have been updated. 
For the purpose of determining whether a match occurs, eciwx is treated as a load, 
and ecowx and dcbz are treated as stores. 
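
The three recommended match conditions translate directly into a predicate. The sketch below is hedged accordingly: the function names are hypothetical, and the positions of the BT, DW, and DR fields (bits 29, 30, and 31 of the DABR) are an assumption taken from the register layout that Section 2.3.15 is said to describe; only the DAB field width (bits 0-28) follows from the text above. The undefined cases listed above are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_DR   UINT32_C(0x00000010)  /* MSR bit 27                        */
#define DABR_DAB UINT32_C(0xFFFFFFF8)  /* DABR bits 0-28                    */
#define DABR_BT  UINT32_C(0x00000004)  /* DABR bit 29 (assumed position)    */
#define DABR_DW  UINT32_C(0x00000002)  /* DABR bit 30 (assumed position)    */
#define DABR_DR  UINT32_C(0x00000001)  /* DABR bit 31 (assumed position)    */

static_assert((DABR_DAB | DABR_BT | DABR_DW | DABR_DR) == UINT32_C(0xFFFFFFFF),
              "the four fields cover the whole register");

/* Match test for a single byte address. */
bool dabr_byte_match(uint32_t dabr, uint32_t msr, uint32_t byte_ea, bool is_store)
{
    bool addr_ok  = (byte_ea & DABR_DAB) == (dabr & DABR_DAB);
    bool xlate_ok = ((msr & MSR_DR) != 0) == ((dabr & DABR_BT) != 0);
    bool rw_ok    = is_store ? (dabr & DABR_DW) != 0 : (dabr & DABR_DR) != 0;
    return addr_ok && xlate_ok && rw_ok;
}

/* A match is detected if the conditions hold for any byte accessed. */
bool dabr_access_match(uint32_t dabr, uint32_t msr, uint32_t ea,
                       uint32_t len, bool is_store)
{
    for (uint32_t i = 0; i < len; i++)
        if (dabr_byte_match(dabr, msr, ea + i, is_store))
            return true;
    return false;
}
```

Because the comparison covers EA[0-28], the breakpoint granularity is one double word: a 4-byte store at 0x0FFE matches a breakpoint set on 0x1000 because two of its bytes fall in that double word.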

If an stwcx. instruction has an EA for which a normal store operation would cause a DSI 
exception but the processor does not have the reservation from lwarx, whether a DSI 
exception is taken is implementation-dependent. 

If the value in XER[25-31] indicates that a load or store string instruction has a length of 
zero, a DSI exception does not occur, regardless of the effective address. 

The condition that caused the exception is defined in the DSISR. As shown in Table 6-9, 
this exception also sets the data address register (DAR). 

Table 6-9. DSI Exception — Register Settings 


Register   Setting Description

SRR0       Set to the effective address of the instruction that caused the exception.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be
           copied to SRR1.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME   —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0

DSISR      0       Set if a load or store instruction results in a direct-store error
                   exception; otherwise cleared.
                   Note: The direct-store facility is being phased out of the
                   architecture and is not likely to be supported in future devices.
           1       Set if the translation of an attempted access is not found in the
                   primary hash table entry group (HTEG), or in the rehashed secondary
                   HTEG, or in the range of a DBAT register (page fault condition);
                   otherwise cleared.
           2-3     Cleared
           4       Set if a memory access is not permitted by the page or DBAT
                   protection mechanism; otherwise cleared.
           5       Set if the eciwx, ecowx, lwarx, or stwcx. instruction is attempted
                   to direct-store interface space, or if the lwarx or stwcx.
                   instruction is used with addresses that are marked as
                   write-through; otherwise cleared.
                   Note: The direct-store facility is being phased out of the
                   architecture and is not likely to be supported in future devices.
           6       Set for a store operation and cleared for a load operation.
           7-8     Cleared
           9       Set if a DABR match occurs; otherwise cleared.
           10      Cleared
           11      Set if the instruction is an eciwx or ecowx and EAR[E] = 0;
                   otherwise cleared.
           12-31   Cleared

           Due to the multiple exception conditions possible from the execution of a
           single instruction, the following combinations of bits of DSISR may be set
           concurrently:

           • Bits 1 and 11
           • Bits 4 and 5
           • Bits 4 and 11
           • Bits 5 and 11

           Additionally, bit 6 is set if the instruction that caused the exception is a
           store, ecowx, dcbz, dcba, or dcbi and bit 6 would otherwise be cleared. Also,
           bit 9 (DABR match) may be set alone, or in combination with any other bit, or
           with any of the other combinations shown above.

DAR        Set to the effective address of a memory element as described in the
           following list:

           • A byte in the first word accessed in the segment or BAT area that caused
             the DSI exception, for a byte, half word, or word memory access (to a
             segment or BAT area).

           • A byte in the first double word accessed in the segment or BAT area that
             caused the DSI exception, for a double-word memory access (to a segment or
             BAT area).

           • A byte in the block that caused the exception for a cache management
             instruction.

           • Any EA in the memory range addressed (for direct-store error exceptions).
             Note: The direct-store facility is being phased out of the architecture
             and is not likely to be supported in future devices.

           • The EA computed by the instruction for the attempted execution of an eciwx
             or ecowx instruction when EAR[E] is cleared.

           • If the exception is caused by a DABR match, the DAR is set to the
             effective address of any byte in the range from A to B inclusive, where A
             is the effective address of the word (for a byte, half word, or word
             access) or double word (for a double word access) specified by the EA
             computed by the instruction, and B is the EA of the last byte in the word
             or double word in which the match occurred.


When a DSI exception is taken, instruction execution resumes at offset 0x00300 from the 
physical base address determined by MSR[IP]. 
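
A DSI handler typically begins by decoding the DSISR. The following sketch lists the conditions recorded there; the macro and function names are hypothetical, the descriptive strings are illustrative, and the masks assume this manual's bit numbering (bit 0 is the most-significant bit).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DSISR_BIT(n) (UINT32_C(1) << (31 - (n)))

#define DSISR_DIRECT_STORE  DSISR_BIT(0)
#define DSISR_PAGE_FAULT    DSISR_BIT(1)
#define DSISR_PROTECTION    DSISR_BIT(4)
#define DSISR_UNSUPPORTED   DSISR_BIT(5)
#define DSISR_STORE         DSISR_BIT(6)
#define DSISR_DABR_MATCH    DSISR_BIT(9)
#define DSISR_EAR_DISABLED  DSISR_BIT(11)

static const struct { uint32_t mask; const char *text; } dsi_conditions[] = {
    { DSISR_DIRECT_STORE, "direct-store error" },
    { DSISR_PAGE_FAULT,   "translation not found (page fault)" },
    { DSISR_PROTECTION,   "protection violation" },
    { DSISR_UNSUPPORTED,  "instruction not supported for this memory type" },
    { DSISR_DABR_MATCH,   "DABR match" },
    { DSISR_EAR_DISABLED, "eciwx/ecowx with EAR[E] = 0" },
};

/* Store up to max descriptions of the conditions set in dsisr;
 * return how many were stored. */
size_t dsi_describe(uint32_t dsisr, const char *out[], size_t max)
{
    assert(out != NULL || max == 0);
    size_t n = 0;
    for (size_t i = 0; i < sizeof dsi_conditions / sizeof dsi_conditions[0]; i++)
        if ((dsisr & dsi_conditions[i].mask) != 0 && n < max)
            out[n++] = dsi_conditions[i].text;
    return n;
}

/* Bit 6 distinguishes store-type accesses from loads. */
int dsi_is_store(uint32_t dsisr)
{
    return (dsisr & DSISR_STORE) != 0;
}
```

A handler that services page faults would test `DSISR_PAGE_FAULT` first and then use the DAR to find the faulting address.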


6.4.4 ISI Exception (0x00400) 

An ISI exception occurs when no higher priority exception exists and an attempt to fetch 
the next instruction to be executed fails for any of the following reasons: 

• The effective address cannot be translated. For example, when there is a page fault 
for this portion of the translation, an ISI exception must be taken to retrieve the page 
(and possibly the translation), typically from a storage device. 

• An attempt is made to fetch an instruction from a no-execute segment. 

• An attempt is made to fetch an instruction from guarded memory and MSR[IR] = 1. 

• The fetch access violates memory protection. 

• An attempt is made to fetch an instruction from a direct-store segment. 

NOTE: The direct-store facility is being phased out of the architecture and is not 
likely to be supported in future devices. 

Register settings for ISI exceptions are shown in Table 6-10. 


Table 6-10. ISI Exception — Register Settings 


Register   Setting Description

SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no exception conditions were present (if the
           exception occurs on attempting to fetch a branch target, SRR0 is set to the
           branch target address).

SRR1       1       Set if the translation of an attempted access is not found in the
                   primary hash table entry group (HTEG), or in the rehashed secondary
                   HTEG, or in the range of an IBAT register (page fault condition);
                   otherwise cleared.
           2       Cleared
           3       Set if the fetch access occurs to a direct-store segment
                   (SR[T] = 1), to a no-execute segment (N bit set in segment
                   descriptor), or to guarded memory when MSR[IR] = 1; otherwise
                   cleared.
                   Note: The direct-store facility is being phased out of the
                   architecture and is not likely to be supported in future devices.
           4       Set if a memory access is not permitted by the page or IBAT
                   protection mechanism, described in Chapter 7, “Memory Management”;
                   otherwise cleared.
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Only one of bits 1, 3, and 4 can be set.

           Also, note that depending on the implementation, additional bits in the MSR
           may be copied to SRR1.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME   —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0


When an ISI exception is taken, instruction execution resumes at offset 0x00400 from the 
physical base address determined by MSR[IP]. 
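
Because only one of SRR1 bits 1, 3, and 4 can be set for an ISI exception, a handler can classify the cause with a single switch. The enumeration and function names below are hypothetical, and the masks assume this manual's bit numbering (bit 0 is the most-significant bit).

```c
#include <assert.h>
#include <stdint.h>

#define SRR1_BIT(n) (UINT32_C(1) << (31 - (n)))

#define SRR1_ISI_PAGE_FAULT  SRR1_BIT(1)
#define SRR1_ISI_NO_FETCH    SRR1_BIT(3)  /* direct-store, no-execute, or guarded */
#define SRR1_ISI_PROTECTION  SRR1_BIT(4)

static_assert(SRR1_ISI_PAGE_FAULT == UINT32_C(0x40000000), "bit 1 mask");

/* Hypothetical classification, for illustration only. */
typedef enum {
    ISI_UNKNOWN,      /* none, or more than one, of bits 1, 3, 4 set */
    ISI_PAGE_FAULT,
    ISI_NO_FETCH,
    ISI_PROTECTION
} isi_cause;

isi_cause isi_decode(uint32_t srr1)
{
    switch (srr1 & (SRR1_ISI_PAGE_FAULT | SRR1_ISI_NO_FETCH | SRR1_ISI_PROTECTION)) {
    case SRR1_ISI_PAGE_FAULT: return ISI_PAGE_FAULT;
    case SRR1_ISI_NO_FETCH:   return ISI_NO_FETCH;
    case SRR1_ISI_PROTECTION: return ISI_PROTECTION;
    default:                  return ISI_UNKNOWN;
    }
}
```

The saved MSR bits in the low half of SRR1 are masked off before the comparison, so they do not affect the result.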

6.4.5 External Interrupt (0x00500) 

An external interrupt exception is signaled to the processor by the assertion of the external 
interrupt signal. The exception may be delayed by other higher priority exceptions or if the 
MSR[EE] bit is zero when the exception is detected. 

NOTE: The occurrence of this exception does not cancel the external request. 

The register settings for the external interrupt exception are shown in Table 6-11. 


Table 6-11. External Interrupt — Register Settings 


Register   Setting Description

SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no interrupt conditions were present.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be
           copied to SRR1.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME   —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0


When an external interrupt exception is taken, instruction execution resumes at offset 
0x00500 from the physical base address determined by MSR[IP]. 

6.4.6 Alignment Exception (0x00600) 

This section describes conditions that can cause alignment exceptions in the processor. 
Similar to DSI exceptions, alignment exceptions use the SRR0 and SRR1 to save the 
machine state and the DSISR to determine the source of the exception. An alignment 
exception occurs when no higher priority exception exists and the implementation cannot 
perform a memory access for one of the following reasons: 

• The operand of a floating-point load or store instruction is not word-aligned. 

• The operand of lmw, stmw, lwarx, stwcx., eciwx, or ecowx is not aligned. 

• The instruction is lmw, stmw, lswi, lswx, stswi, or stswx and the processor is in 
little-endian mode. 

• The operand of an elementary or string load or store crosses a protection boundary. 

• The operand of lmw or stmw crosses a segment or BAT boundary. 

• The operand of dcbz is in memory that is write-through-required or caching 
inhibited, or dcbz is executed in an implementation that has either no data cache or 
a write-through data cache. 

• The operand of a floating-point load or store instruction is in a direct-store segment 
(T = 1). 

NOTE: The direct-store facility is being phased out of the architecture and is not 
likely to be supported in future devices. 

For lmw, stmw, lswi, lswx, stswi, and stswx instructions in little-endian mode, an 
alignment exception always occurs. For lmw and stmw instructions with an operand that is 
not aligned in big-endian mode, and for lwarx, stwcx., eciwx, and ecowx with an operand 
that is not aligned in either endian mode, an implementation may yield boundedly- 
undefined results instead of causing an alignment exception (for eciwx and ecowx when 
EAR[E] = 0, a third alternative is to cause a DSI exception). For all other cases listed above, 
an implementation may execute the instruction correctly instead of causing an alignment 
exception. For the dcbz instruction, correct execution means clearing each byte of the block 
in main memory. See Section 3.1, “Data Organization in Memory and Data Transfers,” for 
a complete definition of alignment in the PowerPC architecture. 

The term ‘protection boundary’ refers to the boundary between protection domains. A 
protection domain is a segment, a block of memory defined by a BAT entry, a virtual 
4-Kbyte page, or a range of unmapped effective addresses. Protection domains are defined 
only when the corresponding address translation (instruction or data) is enabled (MSR[IR] 
or MSR[DR] = 1). 
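
For the power-of-two domains named above, whether an operand crosses a boundary can be checked with simple address arithmetic. The helper below is a sketch with a hypothetical name; it covers only fixed-size domains such as 4-Kbyte pages (2^12 bytes) and 256-Mbyte segments (2^28 bytes), not BAT areas of arbitrary programmed size or unmapped ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the len-byte operand starting at ea spans two different
 * naturally aligned regions of size 2^log2_size bytes. */
bool crosses_boundary(uint32_t ea, uint32_t len, unsigned log2_size)
{
    assert(len > 0 && log2_size < 32);
    uint32_t last = ea + (len - 1);  /* address of the operand's last byte */
    return (ea >> log2_size) != (last >> log2_size);
}
```

For example, a word access at 0x00000FFE crosses a page boundary (`log2_size` of 12), whereas the same access at 0x00000FFC does not.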


The register settings for alignment exceptions are shown in Table 6-12. 


Table 6-12. Alignment Exception — Register Settings 


Register   Setting Description

SRR0       Set to the effective address of the instruction that caused the exception.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be
           copied to SRR1.

MSR        POW  0     FP   0     BE   0     DR  0
           ILE  —     ME   —     FE1  0     RI  0
           EE   0     FE0  0     IP   —     LE  Set to value of ILE
           PR   0     SE   0     IR   0

DSISR      0-14    Cleared
           15-16   For instructions that use register indirect with index addressing,
                   set to bits 29-30 of the instruction encoding.
                   For instructions that use register indirect with immediate index
                   addressing, cleared.
           17      For instructions that use register indirect with index addressing,
                   set to bit 25 of the instruction encoding.
                   For instructions that use register indirect with immediate index
                   addressing, set to bit 5 of the instruction encoding.
           18-21   For instructions that use register indirect with index addressing,
                   set to bits 21-24 of the instruction encoding.
                   For instructions that use register indirect with immediate index
                   addressing, set to bits 1-4 of the instruction encoding.
           22-26   Set to bits 6-10 (identifying either the source or destination) of
                   the instruction encoding. Undefined for dcbz.
           27-31   Set to bits 11-15 of the instruction encoding (rA) for update-form
                   instructions.
                   Set to either bits 11-15 of the instruction encoding or to any
                   register number not in the range of registers loaded by a valid
                   form instruction for lmw, lswi, and lswx instructions. Otherwise
                   undefined.

           Note: For load or store instructions that use register indirect with index
           addressing, the DSISR can be set to the same value that would have resulted
           if the corresponding instruction that uses register indirect with immediate
           index addressing had caused the exception. Similarly, for load or store
           instructions that use register indirect with immediate index addressing,
           DSISR can hold a value that would have resulted from an instruction that
           uses register indirect with index addressing. For example, a misaligned lwax
           instruction that crosses a protection boundary would normally cause the
           DSISR to be set to the following binary value:

               000000000000 00 0 01 0 0101 ttttt ?????

           The value ttttt refers to the destination register and ????? indicates
           undefined bits.

           However, this register may be set as if the instruction were lwa, as
           follows:

               000000000000 10 0 00 0 1101 ttttt ?????

           If there is no corresponding instruction, no alternative value can be
           specified.

           The instruction pairs that can use the same DSISR values are as follows:

               lbz/lbzx      lbzu/lbzux    lhz/lhzx      lhzu/lhzux    lha/lhax      lhau/lhaux
               lwz/lwzx      lwzu/lwzux    lwa/lwax      stb/stbx      stbu/stbux    sth/sthx
               sthu/sthux    stw/stwx      stwu/stwux    lfs/lfsx      lfsu/lfsux    stfs/stfsx
               stfsu/stfsux

DAR        Set to the EA of the data access as computed by the instruction causing the
           alignment exception.


The architecture does not support the use of a misaligned EA by load/store with reservation 
instructions or by the eciwx and ecowx instructions. If one of these instructions specifies a 
misaligned EA, the exception handler should not emulate the instruction but should treat 
the occurrence as a programming error. 
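
The DSISR field mapping for alignment exceptions can be checked against real instruction encodings. The sketch below computes DSISR bits 15-31 from a 32-bit instruction word; the function names are hypothetical, primary opcode 31 is assumed to identify the register indirect with index forms, and bits 27-31 are always filled from rA even though they are defined only for the cases listed in the table. Bits 0-14 are left cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits first..last of w, using this manual's numbering
 * (bit 0 is the most-significant bit). */
static uint32_t bits(uint32_t w, unsigned first, unsigned last)
{
    assert(first <= last && last <= 31);
    unsigned width = last - first + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (w >> (31 - last)) & mask;
}

/* DSISR value an alignment exception would record for instruction insn. */
uint32_t alignment_dsisr(uint32_t insn)
{
    bool indexed = bits(insn, 0, 5) == 31;  /* assumption: opcode 31 is X-form */

    uint32_t f15_16 = indexed ? bits(insn, 29, 30) : 0;
    uint32_t f17    = indexed ? bits(insn, 25, 25) : bits(insn, 5, 5);
    uint32_t f18_21 = indexed ? bits(insn, 21, 24) : bits(insn, 1, 4);
    uint32_t f22_26 = bits(insn, 6, 10);    /* source or destination register */
    uint32_t f27_31 = bits(insn, 11, 15);   /* rA */

    return (f15_16 << 15) | (f17 << 14) | (f18_21 << 10) | (f22_26 << 5) | f27_31;
}
```

With r3 as the target and r4 as rA, the encoding of lwzx (0x7C64282E) yields 11 0 0000 in DSISR[15-21], and the encoding of lwz (0x80640008) yields 00 0 0000; bits 22-31 hold the two register numbers in both cases.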

6.4.6.1 Integer Alignment Exceptions 

Operations that are not naturally aligned may suffer performance degradation, depending 
on the processor design, the type of operation, the boundaries crossed, and the mode that 
the processor is in during execution. More specifically, these operations may either cause 
an alignment exception or they may cause the processor to break the memory access into 
multiple, smaller accesses with respect to the cache and the memory subsystem. 

6.4.6.1.1 Page Address Translation Access Considerations 

A page address translation access occurs when MSR[DR] is set, SR[T] is cleared, and there 
is no BAT match. 

NOTE: A dcbz instruction causes an alignment exception if the access is to a page or 
block with the W (write-through) or I (cache-inhibit) bit set. 

Misaligned memory accesses that do not cause an alignment exception may not perform as 
well as an aligned access of the same type. The resulting performance degradation due to 
misaligned accesses depends on how well each individual access behaves with respect to 
the memory hierarchy. 

Particular details regarding page address translation are implementation-dependent; the 
reader should consult the user’s manual for the appropriate processor for more information. 

6.4.6.1.2 Direct-Store Interface Access Considerations 

The following apply for direct-store interface accesses: 

• If a 256-Mbyte boundary will be crossed by any portion of the direct-store interface 
space accessed by an instruction (the entire string for strings/multiples), an 
alignment exception is taken. 

• Floating-point loads and stores to direct-store segments may cause an alignment 
exception, regardless of operand alignment. 

• The load/store with reservation instructions that map into a direct-store segment
always cause a DSI exception. However, if the instruction crosses a segment
boundary, an alignment exception is taken instead.

NOTE: The direct-store facility is being phased out of the architecture and is not likely 
to be supported in future devices. 
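
The 256-Mbyte boundary rule in the first bullet above can be sketched as a simple range check. The helper below is illustrative only: the name crosses_256mb_boundary and its parameters are assumptions rather than anything defined by the architecture, and it assumes a nonzero length and an access that does not wrap past the top of the 32-bit effective address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if an access of `length` bytes starting
 * at effective address `ea` touches more than one 256-Mbyte region.
 * Assumes length >= 1 and that ea + length - 1 does not wrap past 0xFFFFFFFF. */
bool crosses_256mb_boundary(uint32_t ea, uint32_t length)
{
    uint32_t last = ea + (length - 1u);   /* address of the final byte accessed */

    /* The upper 4 address bits identify the 256-Mbyte region. */
    return (ea >> 28) != (last >> 28);
}
```

For a string or multiple instruction, the length passed in would be that of the entire string, as the bullet notes.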

6.4.6.2 Little-Endian Mode Alignment Exceptions

The OEA allows implementations to take alignment exceptions on misaligned accesses (as 
described in Section 3.1.4, “PowerPC Byte Ordering”) in little-endian mode but does not 
require them to do so. Some implementations may perform some misaligned accesses 
without taking an alignment exception.

6.4.6.3 Interpretation of the DSISR as Set by an Alignment Exception

For most alignment exceptions, an exception handler may be designed to emulate the 
instruction that causes the exception. To do this, the handler requires the following 
characteristics of the instruction: 

• Load or store 

• Length (half word or word) 

• String, multiple, or normal load/store 

• Integer or floating-point 

• Whether the instruction performs update 

• Whether the instruction performs byte reversal 

• Whether it is a dcbz instruction 

The PowerPC architecture provides this information implicitly, by setting opcode bits in the 
DSISR that identify the excepting instruction type. The exception handler does not need to 
load the excepting instruction from memory. The mapping for all exception possibilities is 
unique except for the few exceptions discussed below. 

Table 6-13 shows the inverse mapping — how the DSISR bits identify the instruction that 
caused the exception. 
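
As a rough illustration of how a handler might use Table 6-13, the sketch below extracts DSISR[15-21] and maps a handful of encodings to mnemonics. It assumes the PowerPC convention that bit 0 is the most-significant bit of a 32-bit register, so bit 21 sits 10 positions above the least-significant bit. The function names are hypothetical, and only a few rows of the table are included.

```c
#include <stdint.h>

/* Extract DSISR[15-21] as a 7-bit value.
 * PowerPC numbers bits from the most-significant end (bit 0 = MSB),
 * so bit 21 is 31 - 21 = 10 positions above the least-significant bit. */
unsigned dsisr_bits_15_21(uint32_t dsisr)
{
    return (unsigned)((dsisr >> 10) & 0x7Fu);
}

/* Hypothetical lookup covering only a few rows of Table 6-13. */
const char *misaligned_mnemonic(uint32_t dsisr)
{
    switch (dsisr_bits_15_21(dsisr)) {
    case 0x00: return "lwz (or lwarx)";  /* 00 0 0000 */
    case 0x02: return "stw";             /* 00 0 0010 */
    case 0x04: return "lhz";             /* 00 0 0100 */
    case 0x05: return "lha";             /* 00 0 0101 */
    case 0x06: return "sth";             /* 00 0 0110 */
    case 0x5F: return "dcbz";            /* 10 1 1111 */
    default:   return "see Table 6-13";  /* rows not covered by this sketch */
    }
}
```

A complete emulator would cover every row and would take the operand address from the DAR rather than recomputing it.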

The alignment exception handler cannot distinguish a floating-point load or store that
causes an exception because it is misaligned from one that causes an exception because it
addresses the direct-store interface space. This does not matter; in either case the
instruction is emulated with integer instructions. Floating-point instructions are, however,
distinguished from integer instructions because different register files must be accessed
while emulating each class. Bits 15-21 of the DSISR are used to identify whether the
instruction is integer or floating-point.

NOTE: The direct-store facility is being phased out of the architecture and is not likely 
to be supported in future devices. 


Table 6-13. DSISR(15-21) Settings to Determine Misaligned Instruction

DSISR[15-21]   Instruction                   DSISR[15-21]   Instruction

00 0 0000      lwarx, lwz, special cases 1   01 1 0101      —
00 0 0010      stw                           10 0 0010      stwcx.
00 0 0100      lhz
00 0 0101      lha                           10 0 1000      lwbrx
00 0 0110      sth                           10 0 1010      stwbrx
00 0 0111      lmw                           10 0 1100      lhbrx
00 0 1000      lfs                           10 0 1110      sthbrx
00 0 1001      —                             10 1 0100      eciwx
00 0 1010      stfs                          10 1 0110      ecowx
00 0 1011      —                             10 1 1111      dcbz
00 0 1101      lwa                           11 0 0000      lwzx
                                             11 0 0010      stwx
00 1 0000      lwzu                          11 0 0100      lhzx
00 1 0010      stwu                          11 0 0101      lhax
00 1 0100      lhzu                          11 0 0110      sthx
00 1 0101      lhau                          11 0 1000      lfsx
00 1 0110      sthu                          11 0 1001      —
00 1 0111      stmw                          11 0 1010      stfsx
00 1 1000      lfsu                          11 0 1011      —
00 1 1001      —                             11 0 1111      stfiwx
00 1 1010      stfsu                         11 1 0000      lwzux
00 1 1011      —                             11 1 0010      stwux
                                             11 1 0100      lhzux
                                             11 1 0101      lhaux
01 0 0101      lwax                          11 1 0110      sthux
01 0 1000      lswx                          11 1 1000      lfsux
01 0 1001      lswi                          11 1 1001      —
01 0 1010      stswx                         11 1 1010      stfsux
01 0 1011      stswi                         11 1 1011      —


1 The instructions lwz and lwarx give the same DSISR bits (all zero). But if lwarx causes an
alignment exception, it is an invalid form, so it need not be emulated in any precise way. It is
adequate for the alignment exception handler to simply emulate the instruction as if it were an
lwz. It is important that the emulator use the address in the DAR, rather than computing it
from rA/rB/D, because lwz and lwarx use different addressing modes.

If opcode 0 (“illegal or reserved”) can cause an alignment exception, it will be indistinguishable
to the exception handler from lwarx and lwz.

6.4.7 Program Exception (0x00700)

A program exception occurs when no higher priority exception exists and one or more of 
the following exception conditions, which correspond to bit settings in SRR1, occur during 
execution of an instruction: 

• System IEEE floating-point enabled exception — A system IEEE floating-point 
enabled exception can be generated when FPSCR[FEX] is set and either (or both) 
of the MSR[FE0] or MSR[FE1] bits is set. 

FPSCR[FEX] is set by the execution of a floating-point instruction that causes an 
enabled exception or by the execution of a “move to FPSCR” type instruction that 
sets an exception bit when its corresponding enable bit is set. Floating-point 
exceptions are described in Section 3.3.6, “Floating-Point Program Exceptions.” 

• Illegal instruction — An illegal instruction program exception is generated when 
execution of an instruction is attempted with an illegal opcode or illegal combination 
of opcode and extended opcode fields (these include PowerPC instructions not 
implemented in the processor), or when execution of an optional or a reserved 
instruction not provided in the processor is attempted. 

NOTE: Implementations are permitted to generate an illegal instruction program
exception when encountering the following instructions. If an illegal
instruction exception is not generated, then the alternative is shown in
parentheses.

— An instruction that corresponds to an invalid class (the results may be boundedly
undefined)

— An lswx instruction for which rA or rB is in the range of registers to be loaded 
(may cause results that are boundedly undefined) 

— A move to/from SPR instruction with an SPR field that does not contain one of 
the defined values 

- MSR[PR] = 1 and spr[0] = 1 (this can cause a privileged instruction program 
exception) 

- MSR[PR] = 0 or spr[0] = 0 (may cause boundedly-undefined results.) 

— An unimplemented floating-point instruction that is not optional (may cause a 
floating-point assist exception) 

• Privileged instruction — A privileged instruction type program exception is
generated when the execution of a privileged instruction is attempted and the
processor is operating in user mode (MSR[PR] is set). It is also generated for mtspr
or mfspr instructions that have an invalid SPR field if spr[0] = 1 and MSR[PR] = 1.
Some implementations may also generate a privileged instruction program exception
if a specified SPR field (for a move to/from SPR instruction) is not defined for a
particular implementation but spr[0] = 1; in this case, the implementation may
generate either a privileged instruction program exception or an illegal instruction
program exception.

• Trap — A trap program exception is generated when any of the conditions specified
in a trap instruction is met. Trap instructions are described in Section 4.2.4.6, “Trap
Instructions.”

The register settings when a program exception is taken are shown in Table 6-14. 


Table 6-14. Program Exception — Register Settings

Register   Setting Description

SRR0       The contents of SRR0 differ according to the following situations (see also SRR1[15]):

           • For all program exceptions except floating-point enabled exceptions when operating in
             imprecise mode (MSR[FE0] ≠ MSR[FE1]), SRR0 contains the EA of the excepting instruction.

           • When the processor is in floating-point imprecise mode, SRR0 may contain the EA of the
             excepting instruction or that of a subsequent unexecuted instruction. If the subsequent
             instruction is sync or isync, SRR0 points no more than four bytes beyond the sync or
             isync instruction.

           • If FPSCR[FEX] = 1, but IEEE floating-point enabled exceptions are disabled (MSR[FE0] =
             MSR[FE1] = 0), the program exception occurs before the next synchronizing event if an
             instruction alters those bits (thus enabling the program exception). When this occurs,
             SRR0 points to the instruction that would have executed next and not to the instruction
             that modified MSR.

SRR1       1-4     Cleared
           10      Cleared
           11      Set for an IEEE floating-point enabled program exception; otherwise cleared.
           12      Set for an illegal instruction program exception; otherwise cleared.
           13      Set for a privileged instruction program exception; otherwise cleared.
           14      Set for a trap program exception; otherwise cleared.
           15      Cleared if SRR0 contains the address of the instruction causing the
                   exception, and set if SRR0 contains the address of a subsequent instruction.
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0


When a program exception is taken, instruction execution resumes at offset 0x00700 from
the physical base address determined by MSR[IP].
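
The SRR1 bit assignments in Table 6-14 lend themselves to a small decoding routine at the top of a program exception handler. The following is a minimal sketch, assuming PowerPC bit numbering (bit 0 is the most-significant bit); the PPC_BIT macro, the enum, and the function names are hypothetical.

```c
#include <stdint.h>

/* PowerPC bit numbering: bit 0 is the most-significant bit of a 32-bit register. */
#define PPC_BIT(n) (UINT32_C(1) << (31 - (n)))

/* Hypothetical classification of the program exception causes in Table 6-14. */
enum program_cause {
    CAUSE_NONE = 0,
    CAUSE_FP_ENABLED,   /* SRR1[11] */
    CAUSE_ILLEGAL,      /* SRR1[12] */
    CAUSE_PRIVILEGED,   /* SRR1[13] */
    CAUSE_TRAP          /* SRR1[14] */
};

/* Returns the first cause bit found, checking bits 11 through 14 in order. */
enum program_cause program_exception_cause(uint32_t srr1)
{
    if (srr1 & PPC_BIT(11)) return CAUSE_FP_ENABLED;
    if (srr1 & PPC_BIT(12)) return CAUSE_ILLEGAL;
    if (srr1 & PPC_BIT(13)) return CAUSE_PRIVILEGED;
    if (srr1 & PPC_BIT(14)) return CAUSE_TRAP;
    return CAUSE_NONE;
}

/* SRR1[15] set means SRR0 holds the address of a subsequent instruction
 * rather than the address of the instruction that caused the exception. */
int srr0_is_subsequent(uint32_t srr1)
{
    return (srr1 & PPC_BIT(15)) != 0;
}
```

A handler would typically check srr0_is_subsequent before treating SRR0 as the address of the faulting instruction.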

6.4.8 Floating-Point Unavailable Exception (0x00800) 

A floating-point unavailable exception occurs when no higher priority exception exists, an
attempt is made to execute a floating-point instruction (including floating-point load, store,
or move instructions), and the floating-point available bit in the MSR is cleared
(MSR[FP] = 0).

The register settings for floating-point unavailable exceptions are shown in Table 6-15.

Table 6-15. Floating-Point Unavailable Exception — Register Settings

Register   Setting Description

SRR0       Set to the effective address of the instruction that caused the exception.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0


When a floating-point unavailable exception is taken, instruction execution resumes at
offset 0x00800 from the physical base address determined by MSR[IP].


6.4.9 Decrementer Exception (0x00900) 

A decrementer exception occurs when no higher priority exception exists, a decrementer 
exception condition occurs (for example, the decrementer register has completed 
decrementing), and MSR[EE] = 1. The decrementer register counts down, causing an 
exception request when it passes through zero. A decrementer exception request remains 
pending until the decrementer exception is taken and then it is cancelled. The decrementer 
implementation meets the following requirements: 

• The counters for the decrementer and the time-base counter are driven by the same 
fundamental time base. 

• Loading a GPR from the decrementer does not affect the decrementer. 

• Storing a GPR value to the decrementer replaces the value in the decrementer with 
the value in the GPR. 

• Whenever bit 0 of the decrementer changes from 0 to 1, a decrementer exception 
request is signaled. If multiple decrementer exception requests are received before 
the first can be reported, only one exception is reported. The occurrence of a 
decrementer exception cancels the request. 

• If the decrementer is altered by software and if bit 0 is changed from 0 to 1 , an 
exception request is signaled. 
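
The request rule in the last two bullets (a request is signaled whenever bit 0 changes from 0 to 1, whether by counting or by a software write) can be modeled in a few lines. This is a toy model, not a description of any implementation: the function names are hypothetical, and bit 0 is taken to be the most-significant bit of the 32-bit decrementer, following PowerPC bit numbering.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bit 0 (the most-significant bit in PowerPC numbering) changes
 * from 0 to 1 between the old and new decrementer values. */
bool dec_signals_request(uint32_t old_value, uint32_t new_value)
{
    return ((~old_value & new_value) >> 31) != 0;
}

/* One tick of a toy decrementer: count down by one and report whether
 * the tick raises an exception request. */
bool dec_tick(uint32_t *dec)
{
    uint32_t old_value = *dec;

    *dec = old_value - 1u;   /* unsigned wraparound: 0 becomes 0xFFFFFFFF */
    return dec_signals_request(old_value, *dec);
}
```

Counting from 1 to 0 raises no request; the next tick, from 0 to 0xFFFFFFFF, does, which matches the description of the decrementer passing through zero.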


The register settings for the decrementer exception are shown in Table 6-16. 


Table 6-16. Decrementer Exception — Register Settings

Register   Setting Description

SRR0       Set to the effective address of the instruction that the processor would have attempted to
           execute next if no exception conditions were present.

SRR1       1-4     Cleared
           10-15   Cleared
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0


When a decrementer exception is taken, instruction execution resumes at offset 0x00900
from the physical base address determined by MSR[IP].


6.4.10 System Call Exception (0x00C00)

A system call exception occurs when a System Call (sc) instruction is executed. The
effective address of the instruction following the sc instruction is placed into SRR0. MSR
bits are saved in SRR1, as shown in Table 6-17. Then a system call exception is generated.

The system call exception causes the next instruction to be fetched from offset 0x00C00
from the physical base address determined by the new setting of MSR[IP]. As with most
other exceptions, this exception is context-synchronizing. Refer to Section 6.1.2.1,
“Context Synchronization,” for more information on the actions performed by a
context-synchronizing operation. Register settings are shown in Table 6-17.


Table 6-17. System Call Exception — Register Settings

Register   Setting Description

SRR0       Set to the effective address of the instruction following the System Call instruction

SRR1       0-15    Undefined
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0

When a system call exception is taken, instruction execution resumes at offset 0x00C00
from the physical base address determined by MSR[IP].
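
The register updates summarized in Table 6-17 can be pictured as a single state transition. The sketch below is a software model under stated assumptions, not a description of hardware: the cpu_model structure and function name are hypothetical, the MSR bit positions (ILE = 15, ME = 19, IP = 25, LE = 31) are taken from the 32-bit MSR definition referenced in Section 2.3.1, the undefined SRR1 bits 0-15 are modeled as zero, and MSR bits not listed in the table are cleared.

```c
#include <stdint.h>

/* PowerPC bit numbering: bit 0 is the most-significant bit. */
#define PPC_BIT(n) (UINT32_C(1) << (31 - (n)))

/* MSR bit positions assumed from the 32-bit MSR definition (Section 2.3.1). */
#define MSR_ILE PPC_BIT(15)
#define MSR_ME  PPC_BIT(19)
#define MSR_IP  PPC_BIT(25)
#define MSR_LE  PPC_BIT(31)

/* SRR1 receives MSR bits 16-23, 25-27, and 30-31. */
#define SRR1_MSR_MASK UINT32_C(0x0000FF73)

/* Hypothetical model of the registers involved in exception entry. */
struct cpu_model {
    uint32_t pc;    /* address of the sc instruction being executed */
    uint32_t msr;
    uint32_t srr0;
    uint32_t srr1;
};

void take_system_call(struct cpu_model *cpu)
{
    uint32_t old_msr = cpu->msr;

    cpu->srr0 = cpu->pc + 4u;              /* instruction following sc */
    cpu->srr1 = old_msr & SRR1_MSR_MASK;   /* bits 0-15 undefined; modeled as 0 */

    /* ILE, ME, and IP are unchanged; the other listed bits are cleared. */
    cpu->msr = old_msr & (MSR_ILE | MSR_ME | MSR_IP);
    if (old_msr & MSR_ILE)
        cpu->msr |= MSR_LE;                /* LE is set to the value of ILE */

    /* Vector at offset 0x00C00 from the base selected by MSR[IP]. */
    cpu->pc = ((cpu->msr & MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0))
              + UINT32_C(0x00C00);
}
```

The same pattern applies to the other exceptions in this section, with a different vector offset and, for some, different SRR0 and SRR1[0-15] contents.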

6.4.11 Trace Exception (0x00D00)

The trace exception is optional to the PowerPC architecture, and specific information about 
how it is implemented can be found in user’s manuals for individual processors. 

The trace exception provides a means of tracing the flow of control of a program for 
debugging and performance analysis purposes. It is controlled by MSR bits SE and BE as 
follows: 

• MSR[SE] = 1: the processor generates a single-step type trace exception after each
instruction that completes without causing an exception or context change (such as
occurs when an sc, rfi, or a load instruction that causes an exception, for example, is
executed).

• MSR[BE] = 1 : the processor generates a branch-type trace exception after 
completing the execution of a branch instruction, whether or not the branch is taken. 

If this facility is implemented, a trace exception occurs when no higher priority exception 
exists and either of the conditions described above exist. The following are not traced: 

• rfi instruction

• sc, and trap instructions that trap 

• Other instructions that cause exceptions (other than trace exceptions) 

• The first instruction of any exception handler 

• Instructions that are emulated by software 

MSR[SE, BE] are both cleared when the trace exception is taken. In the normal use of this
function, MSR[SE, BE] are restored when the exception handler returns to the interrupted
program using an rfi instruction.

Register settings for the trace mode are described in Table 6-18.


Table 6-18. Trace Exception — Register Settings

Register   Setting Description

SRR0       Set to the effective address of the next instruction to be executed in the program for which
           the trace exception was generated.

SRR1       1-4     Cleared (also see user's manuals for individual processors)
           10-15   Cleared (also see user's manuals for individual processors)
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0


When a trace exception is taken, instruction execution resumes at offset 0x00D00 from the
base address determined by MSR[IP].

6.4.12 Floating-Point Assist Exception (0x00E00)

The floating-point assist exception is optional to the PowerPC architecture. It can be used 
to allow software to assist in the following situations: 

• Execution of floating-point instructions for which an implementation uses software 
routines to perform certain operations, such as those involving denormalization. 

• Execution of floating-point instructions that are not optional and are not 
implemented in hardware. In this case, the processor may generate an illegal 
instruction type program exception instead. 

Register settings for the floating-point assist exceptions are described in Table 6-19. 


Table 6-19. Floating-Point Assist Exception — Register Settings

Register   Setting Description

SRR0       Set to the address of the next instruction to be executed in the program for which the
           floating-point assist exception was generated.

SRR1       1-4     Implementation-specific information
           10-15   Implementation-specific information
           16-23   Loaded with equivalent bits from the MSR
           25-27   Loaded with equivalent bits from the MSR
           30-31   Loaded with equivalent bits from the MSR

           Note: Depending on the implementation, additional bits in the MSR may be copied to SRR1.

MSR        POW 0     FP  0     BE  0     DR 0
           ILE —     ME  —     FE1 0     RI 0
           EE  0     FE0 0     IP  —     LE Set to value of ILE
           PR  0     SE  0     IR  0


When a floating-point assist exception is taken, instruction execution resumes at offset
0x00E00 from the base address determined by MSR[IP].

Chapter 7. Memory Management


This chapter describes the memory management unit (MMU) specifications provided by 
the PowerPC operating environment architecture (OEA) for PowerPC processors. The 
primary function of the MMU in a PowerPC processor is to translate logical (effective) 
addresses to physical addresses (referred to as real addresses in the architecture 
specification) for memory accesses and I/O accesses (most I/O accesses are assumed to be 
memory-mapped). In addition, the MMU provides various levels of access protection on a 
segment, block, or page basis. 

NOTE: There are many aspects of memory management that are
implementation-specific. This chapter describes the conceptual model of a
PowerPC MMU; however, PowerPC processors may differ in the specific
hardware used to implement the MMU model of the OEA, depending on the
many design trade-offs inherent in each implementation.



Two general types of accesses generated by PowerPC processors require address 
translation — instruction accesses, and data accesses to memory generated by load and store 
instructions. In addition, the addresses specified by cache instructions and the optional 
external control instructions also require translation. Generally, the address translation 
mechanism is defined in terms of segment descriptors and page tables used by PowerPC 
processors to locate the effective to physical address mapping for instruction and data 
accesses. The segment information translates the effective address to an interim virtual 
address, and the page table information translates the virtual address to a physical address. 


The definition of the segment and page table data structures provides significant flexibility 
for the implementation of performance enhancement features in a wide range of processors. 
Therefore, the performance enhancements used to store the segment or page table 
information on-chip vary from implementation to implementation. 


Translation lookaside buffers (TLBs) are commonly implemented in PowerPC processors 
to keep recently-used page address translations on-chip. Although their exact 
characteristics are not specified in the OEA, the general concepts that are pertinent to the 
system software are described. 


The segment information, used to generate the interim virtual addresses, is stored as 
segment descriptors. These descriptors reside in on-chip segment registers.

The block address translation (BAT) mechanism is a software-controlled array that stores
the available block address translations on-chip. BAT array entries are implemented as pairs 
of 32-bit BAT registers that are accessible as supervisor special-purpose registers (SPRs). 

The MMU, together with the exception processing mechanism, provides the necessary 
support for the operating system to implement a paged virtual memory environment and for 
enforcing protection of designated memory areas. Exception processing is described in 
Chapter 6, “Exceptions.” Section 2.3.1, “Machine State Register (MSR),” describes the 
MSR, which controls some of the critical functionality of the MMU. 

NOTE: The architecture specification refers to exceptions as interrupts. 

7.1 MMU Features 

The MMU of a PowerPC processor provides 4 Gbytes of effective address space, a 52-bit
interim virtual address, and physical addresses that are ≤ 32 bits in length.

This chapter describes address translation mechanisms from the perspective of the 
programming model. As such, it describes the structure of the page and segment tables, the 
MMU conditions that cause exceptions, the instructions provided for programming the 
MMU, and the MMU registers. The hardware implementation details of a particular MMU 
(including whether the hardware automatically performs a page table search in memory) 
are not contained in the architectural definition of PowerPC processors and are invisible to 
the PowerPC programming model; therefore, they are not described in this document. If
some of the OEA model is implemented with a software assist mechanism, this software
should be contained in the area of memory reserved for implementation-specific use and
should not be visible to the operating system.

7.2 MMU Overview 

The PowerPC MMU and exception models support demand-paged virtual memory. Virtual 
memory management permits execution of programs larger than the size of physical 
memory; the term demand paged implies that individual pages are loaded into physical 
memory from backing storage only as they are accessed by an executing program. 

The memory management model includes the concept of a virtual address space that is
larger than both the maximum physical memory allowed and the effective address space.
Effective addresses are 32 bits wide. In the address translation process, the processor
converts an effective address to a 52-bit virtual address, as per the information in the
selected descriptor. Then the virtual address is translated back to a physical address that is
the size of the effective address (or smaller).

For implementations that support a physical address range that is smaller than 32 bits, the 
higher-order bits of the effective address cannot be ignored in the address translation 
process. The remainder of this chapter assumes that implementations support the maximum 
physical address range.

The operating system manages the system’s physical memory resources. Consequently, the
operating system initializes the MMU registers (segment registers, BAT registers, and 
SDR1 register) and sets up page tables in memory appropriately. The MMU then assists the 
operating system by managing page status and optionally caching the recently-used address 
translation information on-chip for quick access. 

The effective address space is divided into 256-Mbyte regions called segments for virtual
addressing, or into other large regions called blocks (128 Kbyte to 256 Mbyte) that use the
BAT registers for translation. Segments that correspond to virtual memory can be further
subdivided into 4-Kbyte pages. For programs using virtual addressing, only the most
recently used 4-Kbyte pages need be resident in memory, whereas for programs using block
address translation, the entire block (128 Kbyte to 256 Mbyte) must be resident in memory.

For each page, the operating system creates an address descriptor (page table entry (PTE)). 
The MMU then uses this descriptor to generate the physical address, the protection 
information, and other access control information each time an address within the page is 
accessed. Address descriptors for 4-Kbyte pages reside in page tables in memory and are
cached in TLBs on chip for quick translation.

For each block the operating system creates an address descriptor in one of the four BAT 
array entries. The MMU then uses this descriptor to generate the physical address, the 
protection information, and other access control information each time an address within 
the block is accessed. The MMU keeps the address descriptors for blocks on-chip in the 
BAT array (comprised of the BAT registers). 

This section provides an overview of the high-level organization and operational concepts 
of the MMU in PowerPC processors, and a summary of all MMU control registers. For 
more information about the MSR, see Section 2.3.1, “Machine State Register (MSR).” 
Section 7.4.3, “BAT Register Implementation of BAT Array,” describes the BAT registers. 
Section 7.5.2.1, “Segment Descriptor Definitions,” describes the segment registers.
Section 7.6.1.1, “SDR1 Register Definitions,” describes the SDR1.

7.2.1 Memory Addressing 

A program references memory using the effective (logical) address computed by the 
processor when it executes a load, store, branch, or cache instruction, and when it fetches 
the next instruction. The effective address is translated to a physical (real) address
according to the procedures described throughout this chapter. The memory subsystem uses
the physical address for the access. For a complete discussion of effective address
calculation, see Section 4.1.4.2, “Effective Address Calculation.”

7.2.1.1 Predefined Physical Memory Locations

There are four areas of the physical memory map that have predefined uses. The first 256
bytes of physical memory (or, if MSR[IP] = 1, the first 256 bytes of memory located at
physical address 0xFFF0_0000) are assigned for arbitrary use by the operating system. The
rest of that first page of physical memory defined by the vector base address (determined
by MSR[IP]) is either used for exception vectors or reserved for future exception vectors.
The third predefined area of memory consists of the second and third physical pages of the
memory map, which are used for implementation-specific purposes. In some
implementations, the second and third pages located at physical address
0xFFF0_1000 when MSR[IP] = 1 are also used for implementation-specific purposes.
Fourth, the system software defines the locations in physical memory that contain the
page address translation tables.

These predefined memory areas are summarized in Table 7-1 in terms of the variable ‘Base’ 
and Table 7-2 decodes the actual value of ‘Base’. 

Refer to Chapter 6, “Exceptions,” for more detailed information on the assignment of the 
exception vector offsets. 


Table 7-1. Predefined Physical Memory Locations

Memory Area   Physical Address Range                Predefined Use

1             Base || 0x0_0000-Base || 0x0_00FF     Operating system
2             Base || 0x0_0100-Base || 0x0_0FFF     Exception vectors
3             Base || 0x0_1000-Base || 0x0_2FFF     Implementation-specific 1
4             Software-specified — contiguous       Page table
              sequence of physical pages

1 Only valid for MSR[IP] = 1 on some implementations


Table 7-2. Value of Base for Predefined Memory Use

MSR[IP]   Value of Base

0         Base = 0x000
1         Base = 0xFFF
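
As a worked illustration of the "Base || offset" notation in Tables 7-1 and 7-2, the sketch below concatenates the 12-bit Base value with a 20-bit offset to form a 32-bit physical address. The function name and the MSR_IP macro are hypothetical; the position of MSR[IP] (bit 25, counting from the most-significant bit) is an assumption taken from the MSR definition referenced in Section 2.3.1.

```c
#include <stdint.h>

/* MSR[IP] is assumed to be bit 25 in PowerPC numbering (bit 0 = MSB). */
#define MSR_IP (UINT32_C(1) << (31 - 25))

/* Form Base || offset: Base supplies the upper 12 address bits and the
 * offset supplies the lower 20 bits. */
uint32_t predefined_address(uint32_t msr, uint32_t offset20)
{
    uint32_t base = (msr & MSR_IP) ? UINT32_C(0xFFF) : UINT32_C(0x000);

    return (base << 20) | (offset20 & UINT32_C(0xFFFFF));
}
```

With MSR[IP] = 1, for instance, offset 0x0_0700 maps to physical address 0xFFF0_0700, the program exception vector described in Chapter 6.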


7.2.2 MMU Organization 

Figure 7-1 shows a conceptual block diagram of the MMU. After an address is generated,
the higher-order bits of the effective address, EA0-EA19 (or a smaller set of address bits,
EA0-EAn, in the cases of blocks), are translated into physical address bits PA0-PA19. The
lower-order address bits, A20-A31, are untranslated and therefore identical for both
effective and physical addresses. After translating the address, the MMU passes the
resulting 32-bit physical address to the memory subsystem.

Figure 7-1. MMU Conceptual Block Diagram

7.2.3 Address Translation Mechanisms

PowerPC processors support the following three types of address translation: 

• Page address translation — translates the page frame address for a 4-Kbyte page size 

• Block address translation — translates the block number for blocks that range in size 
from 128 Kbyte to 256 Mbyte 

• Real addressing mode — when address translation is disabled, the effective address 
is used (identical) as the physical address. 

In addition, earlier processors implement a direct-store facility that is used to generate 
direct-store interface accesses on the external bus. 

NOTE: This facility is not optimized for performance, was present for compatibility with 
POWER devices, and is being phased out of the architecture. Future devices are 
not likely to support it; software should not depend on its effects and new 
software should not use it. 

Figure 7-2 shows the address translation mechanisms provided by the MMU. The segment 
descriptors shown in the figure control both the page and direct-store segment address 
translation mechanisms. When an access uses the page or direct-store segment address 
translation, the appropriate segment descriptor is required. One of the 16 on-chip segment 
registers (which contain segment descriptors) is selected by the 4 high-order effective 
address bits. 
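
To make the bit fields concrete, the sketch below splits a 32-bit effective address the way page address translation views it: the 4 high-order bits select a segment register, and the low-order 12 bits are the untranslated byte offset within a 4-Kbyte page. The structure and field names are hypothetical, and the 16-bit width of the middle field is simply what remains of a 32-bit address (32 - 4 - 12).

```c
#include <stdint.h>

/* Hypothetical view of an effective address for page address translation. */
struct ea_fields {
    unsigned sr_index;     /* EA bits 0-3: selects one of 16 segment registers */
    unsigned page_index;   /* EA bits 4-19: page within the 256-Mbyte segment */
    unsigned byte_offset;  /* EA bits 20-31: byte within the 4-Kbyte page */
};

struct ea_fields split_ea(uint32_t ea)
{
    struct ea_fields f;

    f.sr_index    = (unsigned)(ea >> 28);
    f.page_index  = (unsigned)((ea >> 12) & 0xFFFFu);
    f.byte_offset = (unsigned)(ea & 0xFFFu);
    return f;
}
```

For effective address 0xA1234567, this yields segment register 0xA, page index 0x1234, and byte offset 0x567.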

A control bit in the corresponding segment descriptor then determines if the access is to 
memory (includes memory-mapped) or to a direct-store segment. 

NOTE: The direct-store interface is present to allow certain older I/O devices to use this 
interface. When an access is determined to be to the direct-store interface space, 
the implementation invokes an elaborate hardware protocol for communication 
with these devices. The direct-store interface protocol is not optimized for 
performance, and therefore, its use is discouraged. The most efficient method for 
accessing I/O is by memory-mapping the I/O areas. 

For memory accesses translated by a segment descriptor, the interim virtual address is 
generated using the information in the segment descriptor. Page address translation 
corresponds to the conversion of this virtual address into the 32-bit physical address used 
by the memory subsystem. In some cases, the physical address for the page resides in an
on-chip TLB and is available for quick access. However, if the page address translation
misses in a TLB, the MMU searches the page table in memory (using the virtual address
information and a hashing function) to locate the required physical address. Some
implementations may have dedicated hardware to perform the page table search
automatically, while others may define an exception handler routine that searches the page
table with software.

Block address translation occurs in parallel with segment address translation but differs in
that BAT translation is a one-step process. Also, more high-order bits from the effective
address are used in the comparison (as few as 4 and as many as 15 bits). Instead of segment
descriptors and a page table, block address translations use the on-chip BAT registers as a 
BAT array and an associative search is made in the array. If an effective address matches 
one of the corresponding fields in a BAT register, the information in that register is used to 
generate the high-order physical address. When a BAT translation is successful, the results 
of the page translation (occurring in parallel) are ignored. 

NOTE: A matching BAT array entry takes precedence over a translation provided by the 
segment descriptor in all cases (even if the segment is a direct-store segment). 

Direct-store address translation is used when the optional direct-store translation control bit 
(T bit) in the corresponding segment descriptor is set. In this case, the remaining 
information in the segment descriptor is interpreted as identifier information that is used 
with the remaining effective address bits to generate the protocol used in a direct-store 
interface access on the external interface; additionally, no TLB lookup or page table search 
is performed. 

NOTE: This facility is not likely to be supported in future processors. 

When the processor generates an access, and the corresponding address translation enable 
bit in MSR is cleared, the effective address is used as the physical address and all other 
translation mechanisms are ignored. Instruction and data address translation is enabled with 
the MSR[IR] and MSR[DR] bits, respectively. 

See Section 7.2.6.1, “Real Addressing Mode and Block Address Translation Selection,” for 
more information. 

Figure 7-2. Address Translation Types 

7.2.4 Memory Protection Facilities 

In addition to the translation of effective addresses to physical addresses, the MMU 
provides access protection of supervisor areas from user access and can designate areas of 
memory as read-only as well as no-execute. Table 7-3 shows the eight protection options 
supported by the MMU for pages. 

Table 7-3. Access Protection Options for Pages 

                                  User Read           User      Supervisor Read     Supervisor
Option                            I-Fetch   Data      Write     I-Fetch   Data      Write
--------------------------------  -------   ----      -----     -------   ----      -----
Supervisor-only                   -         -         -         y         y         y
Supervisor-only-no-execute        -         -         -         -         y         y
Supervisor-write-only             y         y         -         y         y         y
Supervisor-write-only-no-execute  -         y         -         -         y         y
Both user/supervisor              y         y         y         y         y         y
Both user/supervisor-no-execute   -         y         y         -         y         y
Both read-only                    y         y         -         y         y         -
Both read-only-no-execute         -         y         -         -         y         -

y  Access permitted 
-  Protection violation 


The no-execute option provided in the segment descriptor lets the operating system 
determine whether or not instruction fetches are allowed from an area of memory. The 
remaining options are enforced based on a combination of information in the segment 
descriptor and the page table entry. Thus, the supervisor-only option allows only read and 
write operations generated while the processor is operating in supervisor mode (MSR[PR] 
= 0) to access the page. User accesses that map into a supervisor-only page cause an 
exception. 

NOTE: Independent of the protection mechanisms, care must be taken when writing to 
instruction areas as coherency must be maintained with on-chip copies of 
instructions that may have been prefetched into a queue or an instruction cache. 
Refer to Section 5.1.5.2, “Instruction-Cache Instructions,” for more information 
on coherency within instruction areas. 

As shown in the table, the supervisor-write-only option allows both user and supervisor 
accesses to read from the page, but only supervisor programs can write to that area. There 
is also an option that allows both supervisor and user programs read and write access (both 
user/supervisor option), and finally, there is an option to designate a page as read-only, both 
for user and supervisor programs (both read-only option). 
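The options in Table 7-3 can be expressed as a small permission check. The following is a minimal sketch, not the MMU's actual mechanism: the `page_prot_t` structure, its field names, and the `access_permitted` function are hypothetical and simply encode the table rows as booleans. It assumes, as the text above states for each option, that supervisor-mode data reads are always permitted.

```c
#include <stdbool.h>

/* Access kinds distinguished by Table 7-3. */
typedef enum { ACC_IFETCH, ACC_READ, ACC_WRITE } access_t;

/* Hypothetical encoding of one protection option (not a hardware format). */
typedef struct {
    bool user_read;    /* user-mode reads and fetches permitted */
    bool user_write;   /* user-mode writes permitted */
    bool super_write;  /* supervisor-mode writes permitted */
    bool no_execute;   /* instruction fetches faulted in both modes */
} page_prot_t;

static const page_prot_t SUPERVISOR_ONLY       = { false, false, true,  false };
static const page_prot_t SUPERVISOR_WRITE_ONLY = { true,  false, true,  false };
static const page_prot_t BOTH_USER_SUPERVISOR  = { true,  true,  true,  false };
static const page_prot_t BOTH_READ_ONLY        = { true,  false, false, false };

/* Returns true if the access is permitted, false on a protection violation.
 * user_mode corresponds to MSR[PR] = 1. */
static bool access_permitted(page_prot_t p, bool user_mode, access_t acc)
{
    if (acc == ACC_IFETCH && p.no_execute)
        return false;
    if (acc == ACC_WRITE)
        return user_mode ? p.user_write : p.super_write;
    /* Data reads and permitted fetches: the supervisor can always read. */
    return user_mode ? p.user_read : true;
}
```

A no-execute variant of any option is obtained by copying the option and setting `no_execute`; for example, a user write to a both user/supervisor-no-execute page is still permitted, while an instruction fetch from it is not.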

For areas of memory that are translated by the block address translation mechanism, the 
protection options are similar, except that blocks are translated by separate mechanisms for 
instruction and data, blocks do not have a no-execute option, and blocks can be designated 
as enabled for user and supervisor accesses independently. Therefore, a block can be 
designated as supervisor-only, for example, but this block can be programmed such that all 
user accesses simply ignore the block translation, rather than take an exception in the case 
of a match. This allows a flexible way for supervisor and user programs to use overlapping 
effective address space areas that map to unique physical address areas (without exceptions 
occurring). 

For direct-store segments, the MMU calculates a key bit based on the protection values 
programmed in the segment descriptor and the specific user/supervisor and read/write 
information for the particular access. However, this bit is merely passed on to the system 
interface to be transmitted in the context of the direct-store interface protocol. The MMU 
does not itself enforce any protection or cause any exception based on the state of the key 
bit for these accesses. The I/O controller device or other external hardware can optionally 
use this bit to enforce any protection required. 

NOTE: The direct-store facility is being phased out of the architecture and future devices 
are not likely to implement it. 

Finally, a facility defined in the VEA and OEA allows pages or blocks to be designated as 
guarded, thus preventing out-of-order (also called out-of-sequence) accesses that may cause 
undesired side effects. For example, areas of the memory map that are used to control I/O 
devices can be marked as guarded so that accesses (instruction stores) do not occur out of 
order and start an I/O operation before all other control information has been received 
by the device. Refer to Section 5.2.1.5.3, “Out-of-Order Accesses to Guarded Memory,” for 
a complete description of how accesses to guarded memory are restricted. 

7.2.5 Page History Information 

The MMU of PowerPC processors also defines referenced (R) and changed (C) bits in the 
page address translation mechanism that can be used as history information relevant to the 
usage of a page. The operating system uses the C bit to determine which pages have 
changed and must be written back to disk when new pages replace them in main 
memory. The R bit indicates that a reference (for example, a load instruction) has been 
made to a page; the operating system can use this information when deciding 
which pages to keep in memory. While these bits are initially allocated by the 
operating system in the page table, the architecture specifies that the R and C bits are 
updated by the processor when a program executes a load (R) or store (C) to a page. 
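How an operating system might consume the R and C bits can be sketched in software. The model below is hypothetical: `page_history_t`, `record_access`, and `eviction_rank` are illustrative names. The ranking shown is only one possible replacement policy, not one the architecture prescribes. The sketch also assumes that a store marks the page as referenced as well as changed.

```c
#include <stdbool.h>

/* Hypothetical software model of the history bits kept for one page. */
typedef struct {
    bool referenced;  /* R bit */
    bool changed;     /* C bit */
} page_history_t;

/* Models the update performed for one access to the page. */
static void record_access(page_history_t *h, bool is_store)
{
    h->referenced = true;    /* assumption: any load or store sets R */
    if (is_store)
        h->changed = true;   /* only stores set C */
}

/* A page must be written back before replacement only if it has changed. */
static bool needs_writeback(const page_history_t *h)
{
    return h->changed;
}

/* Illustrative policy: a lower rank is a better candidate for removal. */
static int eviction_rank(const page_history_t *h)
{
    return (h->referenced ? 2 : 0) + (h->changed ? 1 : 0);
}
```

Under this policy an unreferenced, unchanged page is removed first because it is neither in recent use nor in need of a write to disk.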

7.2.6 General Flow of MMU Address Translation 

The following sections describe the general flow used by PowerPC processors to translate 
effective addresses to virtual and then physical addresses. 

NOTE: Although this section refers to the concept of an on-chip TLB, a TLB is a 
performance enhancement that may not be present in a particular hardware 
implementation (and a particular implementation may have one or more TLBs). 
Thus, TLBs are shown here as optional and only the software ramifications of the 
existence of a TLB are discussed. 

7.2.6.1 Real Addressing Mode and Block Address Translation 
Selection 

When an instruction or data access is generated and the corresponding instruction or data 
translation is disabled (MSR[IR] = 0 or MSR[DR] = 0), real addressing mode translation is 
used (physical address equals effective address) and the access continues to the memory 
subsystem as described in Section 7.3, “Real Addressing Mode.” 

Figure 7-3 shows the flow the MMU uses in determining whether to select real addressing 
mode (no translation), block address translation (BAT), or the segment descriptor (virtual 
translation) when addressing the memory subsystem. 


[Figure 7-3: flow diagram. When an effective address is generated for an instruction 
access (I-access) or data access (D-access) and the corresponding translation is disabled 
(MSR[IR] = 0 or MSR[DR] = 0), real addressing mode translation is performed (EA = PA). 
When translation is enabled (MSR[IR] = 1 or MSR[DR] = 1), the address is compared with 
the instruction or data BAT array, as appropriate (see Figure 7-6). On a BAT array miss, 
address translation is performed with the segment descriptor (see Figure 7-4). On a BAT 
array hit, a protected access is faulted; a permitted access has its address translated 
(see Figure 7-11) and continues to the memory subsystem.] 

Figure 7-3. General Flow of Address Translation 

NOTE: If the BAT array search results in a hit, the access is qualified with the appropriate 
protection bits. If the access is determined to be protected (not allowed), an 
exception (ISI or DSI exception) is generated. 
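The selection described in this section reduces to a three-way decision. The sketch below is a software illustration only; the enumeration and the `select_translation` function are hypothetical names. The BAT comparison result is passed in as a flag rather than computed.

```c
#include <stdbool.h>

typedef enum {
    XLATE_REAL,     /* real addressing mode: physical address = effective address */
    XLATE_BAT,      /* block address translation */
    XLATE_SEGMENT   /* page or direct-store translation via the segment descriptor */
} xlate_kind_t;

/* translation_enabled: MSR[IR] for instruction accesses, MSR[DR] for data.
 * bat_hit: outcome of comparing the EA with the instruction or data BAT array. */
static xlate_kind_t select_translation(bool translation_enabled, bool bat_hit)
{
    if (!translation_enabled)
        return XLATE_REAL;
    if (bat_hit)
        return XLATE_BAT;   /* a BAT hit takes precedence over the segment descriptor */
    return XLATE_SEGMENT;
}
```

The order of the tests mirrors the precedence in the text: disabled translation bypasses everything, and a BAT hit causes the segment descriptor result to be ignored.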

7.2.6.2 Page and Direct-Store Address Translation Selection 

If address translation is enabled (real addressing mode translation not selected) and the 
effective address information does not match with a BAT array entry, then the segment 
descriptor must be located. Once the segment descriptor is located, the T bit in the segment 
descriptor selects whether the translation is to a page or to a direct-store segment as shown 
in Figure 7-4. In addition, Figure 7-4 shows the way in which the no-execute 
protection is enforced; if the N bit in the segment descriptor is set and the access is an 
instruction fetch, the access is faulted. 

The segment descriptor for an access is contained in one of 16 on-chip segment registers; 
effective address bits EA0-EA3 select one of the 16 segment registers. 
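Because PowerPC numbers bits from the most-significant end, EA0-EA3 are the top four bits of the 32-bit effective address, and bit 0 of a segment descriptor (the T bit) is its top bit. A minimal sketch of both extractions follows; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* EA0-EA3 are the four most-significant bits of the effective address. */
static unsigned segment_register_index(uint32_t ea)
{
    return (unsigned)(ea >> 28);
}

/* T is bit 0 of the segment descriptor, that is, its most-significant bit.
 * T = 1 selects direct-store translation; T = 0 selects page translation. */
static bool is_direct_store_segment(uint32_t segment_descriptor)
{
    return ((segment_descriptor >> 31) & 1u) != 0;
}
```

For example, effective address 0xF0001234 selects segment register 15.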
[Figure 7-4: flow diagram. Notes in the figure: one marked path is not allowed for 
instruction accesses (causes ISI exception); one step is implementation-specific.] 

Figure 7-4. General Flow of Page and Direct-Store Address Translation 

7.2.6.2.1 Selection of Page Address Translation 

If the T bit in the selected segment descriptor (bit[0]) is 0, the page address translation 
method is used. The information in the segment descriptor is used to generate the 52-bit 
virtual address. The virtual address is used to identify the page address translation 
information (stored as 2-word entries (PTEs) in a page table in memory). Once again, although the 
architecture does not require the existence of a TLB, one or more TLBs may be 
implemented in the hardware to store copies of recently used PTEs on-chip for increased 
performance. A TLB is used like a small cache of the much larger PTE tables in memory. 

If an access hits in the TLB, the page translation occurs and the physical address bits are 
forwarded to the memory subsystem. If the translation is not found in the TLB, the MMU 
requires a search of the page table. The hardware of some implementations may perform 
the table search automatically, while others may trap to an exception handler for the system 
software to perform the page table search. If the translation is found, a new TLB entry is 
created and the page translation is once again attempted. This time, the TLB is guaranteed 
to hit. When the PTE is located, the access is qualified with the appropriate protection bits. 
If the access is determined to be protected (not allowed), an exception (ISI or DSI 
exception) is generated. 

If the PTE is not found by the table search operation, an ISI or DSI exception is generated. 
This is also known as a page fault. 
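The hit, miss, refill, and retry sequence above can be modeled in software. This is a sketch under several assumptions: the TLB is modeled as a small direct-mapped array (real organizations vary), the page table search is represented by a caller-supplied function, and the protection check that follows a successful lookup is omitted. All names, including `demo_search`, are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 8u   /* hypothetical size */

typedef struct {
    bool     valid;
    uint64_t vpn;   /* virtual page number */
    uint32_t rpn;   /* physical page number */
} tlb_entry_t;

typedef struct { tlb_entry_t e[TLB_ENTRIES]; } tlb_t;

/* Stands in for the hardware or software page table search; returns true
 * and fills *rpn when a PTE for vpn exists. */
typedef bool (*table_search_fn)(uint64_t vpn, uint32_t *rpn);

typedef enum { PAGE_XLATE_OK, PAGE_XLATE_FAULT } page_xlate_status_t;

static page_xlate_status_t translate_page(tlb_t *tlb, table_search_fn search,
                                          uint64_t vpn, uint32_t *rpn)
{
    tlb_entry_t *slot = &tlb->e[vpn % TLB_ENTRIES];

    if (!(slot->valid && slot->vpn == vpn)) {   /* TLB miss */
        uint32_t found;
        if (!search(vpn, &found))
            return PAGE_XLATE_FAULT;            /* page fault: ISI or DSI */
        slot->valid = true;                     /* create the new TLB entry */
        slot->vpn = vpn;
        slot->rpn = found;
    }
    *rpn = slot->rpn;                           /* the retry now hits */
    return PAGE_XLATE_OK;
}

/* Hypothetical page table for demonstration: only pages 0..3 are mapped. */
static bool demo_search(uint64_t vpn, uint32_t *rpn)
{
    if (vpn > 3)
        return false;
    *rpn = 0x100u + (uint32_t)vpn;
    return true;
}
```

Whether `search` runs in hardware or in an exception handler does not change the shape of this flow, which is why the search is passed in as a parameter.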

7.2.6.2.2 Selection of Direct-Store Address Translation 

When the segment descriptor has the T bit set, the access is considered a direct-store access 
and the direct-store interface protocol of the external interface is used to perform the access. 
The selection of address translation type differs for instruction and data accesses only in 
that instruction accesses are not allowed from direct-store segments; attempting to fetch an 
instruction from a direct-store segment causes an ISI exception. 

NOTE: This facility is not optimized for performance, was present for compatibility with 
POWER devices, and is being phased out of the architecture. Future devices are 
not likely to support it; software should not depend on its effects and new 
software should not use it. See Section 7.7, “Direct-Store Segment Address 
Translation,” for more detailed information about the translation of addresses in 
direct-store segments in those processors that implement this facility. 

7.2.7 MMU Exceptions Summary 

In order to complete any memory access, the effective address must be translated to a 
physical address. A translation exception condition occurs if this translation fails for one of 
the following reasons: 

• There is no valid entry in the page table in memory for the virtual address generated 
from the effective address and the segment descriptor and no BAT translation 
occurs. 





• An address translation is found but the access is not allowed by the memory 
protection mechanism. 

The translation exception conditions cause either the ISI or the DSI exception to be taken 
as shown in Table 7-4. The state saved by the processor for each of these exceptions 
contains information that identifies the address of the failing instruction. Refer to 
Chapter 6, “Exceptions,” for a more detailed description of exception processing, and the 
bit settings of SRR1 and DSISR when an exception occurs. 


Table 7-4. Translation Exception Conditions 

Condition                         Description                                   Exception
--------------------------------  --------------------------------------------  -----------------------
Page fault (no PTE found)         No matching PTE found in page tables (and     I access: ISI exception
                                  no matching BAT array entry)                    SRR1[1] = 1
                                                                                D access: DSI exception
                                                                                  DSISR[1] = 1

Block protection violation        Conditions described in Table 7-10 for block  I access: ISI exception
                                                                                  SRR1[4] = 1
                                                                                D access: DSI exception
                                                                                  DSISR[4] = 1

Page protection violation         Conditions described in Table 7-20 for page   I access: ISI exception
                                                                                  SRR1[4] = 1
                                                                                D access: DSI exception
                                                                                  DSISR[4] = 1

No-execute protection violation   Attempt to fetch instruction when SR[N] = 1   ISI exception
                                                                                  SRR1[3] = 1

Instruction fetch from            Attempt to fetch instruction when SR[T] = 1   ISI exception
direct-store segment (note that                                                   SRR1[3] = 1
the direct-store facility is
optional and being phased out of
the architecture)

Instruction fetch from guarded    Attempt to fetch instruction when MSR[IR] = 1 ISI exception
memory                            and either:                                     SRR1[3] = 1
                                    matching xBAT[G] = 1, or
                                    no matching BAT entry and PTE[G] = 1
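An exception handler can separate the ISI cases of Table 7-4 by testing the SRR1 bits listed there. The sketch below assumes the PowerPC convention that bit 0 is the most-significant bit of a 32-bit register. The `PPC_BIT32` macro, the enumeration, and `classify_isi` are hypothetical names, and the order in which the bits are tested is an arbitrary choice.

```c
#include <stdint.h>

/* PowerPC bit numbering: bit 0 is the most-significant bit of a 32-bit value. */
#define PPC_BIT32(n) (UINT32_C(1) << (31 - (n)))

typedef enum {
    ISI_UNKNOWN,
    ISI_PAGE_FAULT,           /* SRR1[1]: no PTE found */
    ISI_NO_EXEC_OR_GUARDED,   /* SRR1[3]: no-execute, direct-store, or guarded fetch */
    ISI_PROTECTION            /* SRR1[4]: block or page protection violation */
} isi_cause_t;

static isi_cause_t classify_isi(uint32_t srr1)
{
    if (srr1 & PPC_BIT32(1)) return ISI_PAGE_FAULT;
    if (srr1 & PPC_BIT32(3)) return ISI_NO_EXEC_OR_GUARDED;
    if (srr1 & PPC_BIT32(4)) return ISI_PROTECTION;
    return ISI_UNKNOWN;
}
```

With this numbering, SRR1[1] corresponds to the mask 0x40000000, SRR1[3] to 0x10000000, and SRR1[4] to 0x08000000.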


In addition to the translation exceptions, there are other MMU-related conditions (some of 
them implementation-specific) that can cause an exception to occur. These conditions map 
to the exceptions as shown in Table 7-5. The only MMU exception conditions that occur 
when MSR[DR] = 0 are those that cause the alignment exception for data accesses. For 
more detailed information about the conditions that cause the alignment exception (in 
particular for string/multiple instructions), see Section 6.4.6, “Alignment Exception 
(0x00600).” Refer to Chapter 6, “Exceptions,” for a complete description of the SRR1 and 
DSISR bit settings for these exceptions. 

Table 7-5. Other MMU Exception Conditions 

Condition                               Description                         Exception
--------------------------------------  ----------------------------------  ----------------------------
dcbz with W = 1 or I = 1 (may cause     dcbz instruction to write-through   Alignment exception
exception or operation may be           or cache-inhibited segment or       (implementation-dependent)
performed to memory)                    block

lwarx or stwcx. with W = 1 (may cause   Reservation instruction to          DSI exception
exception or execute correctly)         write-through segment or block      (implementation-dependent)
                                                                            DSISR[5] = 1

lwarx, stwcx., eciwx, or ecowx          Reservation instruction or          DSI exception
instruction to direct-store segment     external control instruction when   (implementation-dependent)
(may cause exception or may produce     SR[T] = 1                           DSISR[5] = 1
boundedly-undefined results); note
that the direct-store facility is
optional and being phased out of the
architecture

Floating-point load or store to         Floating-point memory access        Alignment exception
direct-store segment (may cause         when SR[T] = 1                      (implementation-dependent)
exception or instruction may execute
correctly); note that the
direct-store facility is optional and
being phased out of the architecture

Load or store operation that causes a   Direct-store interface protocol     DSI exception
direct-store error; note that the       signalled with an error condition   DSISR[0] = 1
direct-store facility is optional and
being phased out of the architecture

eciwx or ecowx attempted when           eciwx or ecowx attempted with       DSI exception
external control facility disabled      EAR[E] = 0                          DSISR[11] = 1

lmw, stmw, lswi, lswx, stswi, or        lmw, stmw, lswi, lswx, stswi, or    Alignment exception
stswx instruction attempted in          stswx instruction attempted
little-endian mode                      while MSR[LE] = 1

Operand misalignment                    Translation enabled and operand     Alignment exception (some of
                                        is misaligned as described in       these cases are
                                        Chapter 6, “Exceptions.”            implementation-dependent)


7.2.8 MMU Instructions and Register Summary 

Using the MMU instructions and registers, the operating system establishes the complete 
framework for address translation. This includes, in part, loading the BAT registers, the 
segment registers, and the SDR1 register, and allocating areas in memory for the page table 
and for BAT program and data areas. 

NOTE: Because the implementation of a TLB is optional, the instructions that refer to this 
structure are also optional. However, as these structures serve as caches of the 
page table, there must be a software protocol for maintaining coherency between 
these caches (TLBs) and the tables in memory whenever changes are made to the 
tables in memory. Therefore, the PowerPC OEA specifies that a processor 
implementing a TLB is guaranteed to have a means for doing the following: 

• Invalidating an individual TLB entry 

• Invalidating the entire TLB 

When the tables in memory are changed, the operating system purges these caches of the 
corresponding entries, allowing the translation caching mechanism to re-fetch from the 
tables when the corresponding entries are required. 

A processor may implement one or more of the instructions described in this section to 
support table invalidation. Alternatively, an algorithm may be specified that performs one 
of the functions listed above (a loop invalidating individual TLB entries may be used to 
invalidate the entire TLB, for example), or different instructions may be provided. 
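The loop mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a routine for any particular processor: it assumes a 4-Kbyte page, assumes that invalidating one effective address purges its whole congruence class, and takes the invalidate-entry primitive as a function pointer so that the real instruction sequence stays encapsulated elsewhere. The names `tlb_invalidate_all`, `PPC_PAGE_SIZE`, and `demo_invalidate` are hypothetical, and the number of congruence classes must come from the processor's user's manual.

```c
#include <stdint.h>

#define PPC_PAGE_SIZE 4096u   /* assumed page size */

/* Wrapper around the implementation's invalidate-one-entry primitive
 * (for example, a tlbie instruction). Hypothetical signature. */
typedef void (*tlb_invalidate_entry_fn)(uint32_t ea);

/* Invalidates every congruence class by stepping one page at a time.
 * num_classes is implementation-specific. */
static void tlb_invalidate_all(tlb_invalidate_entry_fn invalidate_entry,
                               uint32_t num_classes)
{
    for (uint32_t i = 0; i < num_classes; i++)
        invalidate_entry(i * PPC_PAGE_SIZE);
}

/* Demonstration stub standing in for the hardware primitive. */
static uint32_t demo_calls;
static uint32_t demo_last_ea;

static void demo_invalidate(uint32_t ea)
{
    demo_calls++;
    demo_last_ea = ea;
}
```

On real hardware the wrapper would also need whatever synchronization the implementation requires after the last invalidation; that part is deliberately left out of this sketch.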

A processor may also perform additional functions (not described here) as well as those 
described in the implementation of some of these instructions. For example, the tlbie 
instruction may be implemented so as to purge all TLB entries in a congruence class (that 
is, all TLB entries indexed by the specified EA which can include corresponding entries in 
data and instruction TLBs) or the entire TLB. 

NOTE: If a processor does not implement an optional instruction it treats the instruction 
as a no-op or as an illegal instruction, depending on the implementation. Also, 
note that the segment register and TLB concepts described here are conceptual; 
that is, a processor may implement parallel sets of segment registers (and even 
TLBs) for instructions and data. 

Because the MMU specification for PowerPC processors is so flexible, it is recommended 
that the software that uses these instructions and registers be encapsulated into subroutines 
to minimize the impact of migrating across the family of implementations. 

Table 7-6 summarizes the PowerPC instructions that specifically control the MMU. For 
more detailed information about the instructions, refer to Chapter 8, “Instruction Set.” 


Table 7-6. Instruction Summary: Control MMU 

Instruction           Description
--------------------  ------------------------------------------------------------------
mtsr SR,rS            Move to Segment Register
                      SR[SR] ← rS

mtsrin rS,rB          Move to Segment Register Indirect
                      SR[rB[0-3]] ← rS

mfsr rD,SR            Move from Segment Register
                      rD ← SR[SR]

mfsrin rD,rB          Move from Segment Register Indirect
                      rD ← SR[rB[0-3]]

tlbia                 Translation Lookaside Buffer Invalidate All
(optional)            For all TLB entries, TLB[V] ← 0
                      Causes invalidation of TLB entries only for the processor that
                      executed the tlbia

tlbie rB              Translation Lookaside Buffer Invalidate Entry
(optional)            If TLB hit (for effective address specified as rB), TLB[V] ← 0
                      Causes TLB invalidation of entry in all processors in system

tlbsync               Translation Lookaside Buffer Synchronize
(optional)            Ensures that all tlbie instructions previously executed by the
                      processor executing the tlbsync instruction have completed on all
                      processors

Table 7-7 summarizes the registers that the operating system uses to program the MMU. 
These registers are accessible to supervisor-level software only (supervisor level is referred 
to as privileged state in the architecture specification). These registers are described in 
detail in Chapter 2, “PowerPC Register Set.” 


Table 7-7. MMU Registers 

Register                Description
----------------------  ------------------------------------------------------------------
Segment registers       Sixteen 32-bit segment registers are present in the PowerPC
(SR0-SR15)              architecture. Figure 7-13 shows the format of a segment register.
                        The fields in the segment register are interpreted differently
                        depending on the value of bit 0. The segment registers are
                        accessed by the mtsr, mtsrin, mfsr, and mfsrin instructions.

BAT registers           There are 16 BAT registers, organized as four pairs of instruction
(IBAT0U-IBAT3U,         BAT registers (IBAT0U-IBAT3U paired with IBAT0L-IBAT3L) and four
IBAT0L-IBAT3L,          pairs of data BAT registers (DBAT0U-DBAT3U paired with
DBAT0U-DBAT3U, and      DBAT0L-DBAT3L). The BAT registers are defined as 32-bit registers.
DBAT0L-DBAT3L)          These are special-purpose registers that are accessed by the mtspr
                        and mfspr instructions.

SDR1 register           The SDR1 register specifies the base and size of the page tables
                        in memory. SDR1 is defined as a 32-bit register. This is a
                        special-purpose register that is accessed by the mtspr and mfspr
                        instructions.


7.2.9 TLB Entry Invalidation 

Optionally, PowerPC processors implement TLB structures that store on-chip copies of the 
PTEs that are resident in physical memory. These processors have the ability to invalidate 
resident TLB entries through the use of the tlbie and tlbia instructions. Additionally, these 
instructions may also enable a TLB invalidate signalling mechanism in hardware so that 
other processors also invalidate their resident copies of the matching PTE. See Chapter 8, 
“Instruction Set,” for detailed information about the tlbie and tlbia instructions. 

7.3 Real Addressing Mode 

If address translation is disabled (MSR[IR] = 0 or MSR[DR] = 0) for a particular access, 
the effective address is treated as the physical address and is passed directly to the memory 
subsystem as a real addressing mode address translation. If an implementation has a smaller 
physical address range than effective address range, the extra high-order bits of the effective 
address may be ignored in the generation of the physical address. 

Section 2.3.17, “Synchronization Requirements for Special Registers and for Lookaside 
Buffers,” describes the synchronization requirements for changes to MSR[IR] and 
MSR[DR]. 

The addresses for accesses that occur in real addressing mode bypass all memory protection 
checks as described in Section 7.4.4, “Block Memory Protection,” and Section 7.5.4, “Page 
Memory Protection” and do not cause the recording of referenced and changed information 
(described in Section 7.5.3, “Page History Recording”). 

For data accesses that use real addressing mode, the memory access mode bits (WIMG) are 
assumed to be 0b0011. That is, the cache is write-back and memory does not need to be 
updated immediately (W = 0), caching is enabled (I = 0), data coherency is enforced with 
memory, I/O, and other processors (caches) (M = 1, so data is global), and the memory is 
guarded. For instruction accesses in real addressing mode, the memory access mode bits 
(WIMG) are assumed to be either 0b0001 or 0b0011. That is, caching is enabled (I = 0) and 
the memory is guarded. Additionally, coherency may or may not be enforced with memory, 
I/O, and other processors (caches) (M = 0 or 1, so data may or may not be considered 
global). For a complete description of the WIMG bits, refer to Section 5.2.1, 
“Memory/Cache Access Attributes.” 
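The four-bit values quoted above are written in the order W, I, M, G, with W as the high-order bit. A small decoder makes the real addressing mode defaults explicit; the `wimg_t` structure and `decode_wimg` function are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool write_through;      /* W */
    bool caching_inhibited;  /* I */
    bool coherent;           /* M */
    bool guarded;            /* G */
} wimg_t;

/* Decodes a 4-bit value written as 0bWIMG (W is the high-order bit). */
static wimg_t decode_wimg(uint8_t bits)
{
    wimg_t w = {
        .write_through     = ((bits >> 3) & 1u) != 0,
        .caching_inhibited = ((bits >> 2) & 1u) != 0,
        .coherent          = ((bits >> 1) & 1u) != 0,
        .guarded           = (bits & 1u) != 0
    };
    return w;
}

#define WIMG_REAL_MODE_DATA 0x3u   /* 0b0011: data accesses in real addressing mode */
```

Decoding `WIMG_REAL_MODE_DATA` yields write-back, caching enabled, coherency enforced, and guarded, matching the description for data accesses.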

NOTE: The attempted execution of the eciwx or ecowx instructions while MSR[DR] = 
0 causes boundedly-undefined results. 

Whenever an exception occurs, the processor clears both the MSR[IR] and MSR[DR] bits. 
Therefore, at least at the beginning of all exception handlers (including reset), the processor 
operates in real addressing mode for instruction and data accesses. If address translation is 
required for the exception handler code, the software must explicitly enable address 
translation by accessing the MSR as described in Chapter 2, “PowerPC Register Set.” 

NOTE: An attempt to access a physical address that is not physically present in the 

system may cause a machine check exception (or even a checkstop condition), 
depending on the response by the memory system for this case. Thus, care must 
be taken when generating addresses in real addressing mode. 

This can also occur when translation is enabled and the SDR1 register sets up the 
translation such that nonexistent memory is accessed. 

See Section 6.4.2, “Machine Check Exception (0x00200)” for more information 
on machine check exceptions. 

7.4 Block Address Translation 

The block address translation (BAT) mechanism in the OEA provides a way to map ranges 
of effective addresses larger than a single page into contiguous areas of physical memory. 
Such areas can be used for data that is not subject to normal virtual memory handling 
(paging), such as a memory-mapped display buffer or an extremely large array of numerical 
(or any type) data. 

The following sections describe the implementation of block address translation in 
PowerPC processors, including the block protection mechanism, followed by a block 
translation summary with a detailed flow diagram. 

7.4.1 BAT Array Organization 

The block address translation mechanism in PowerPC processors is implemented as a 
software-controlled BAT array. The BAT array maintains the address translation 
information for eight blocks of memory. The BAT array in PowerPC processors is 
maintained by the system software and is implemented as a set of 16 special-purpose 
registers (SPRs). Each block is defined by a pair of SPRs called upper and lower BAT 
registers that contain the effective and physical addresses for the block. 

The BAT registers can be read from or written to by the mfspr and mtspr instructions; 
access to the BAT registers is privileged. Section 7.4.3, “BAT Register Implementation of 
BAT Array,” gives more information about the BAT registers. 

NOTE: The BAT array entries are completely ignored for TLB invalidate operations 
detected in hardware and in the execution of the tlbie or tlbia instruction. 

Figure 7-5 shows the organization of the BAT array. Four pairs of BAT registers are 
provided for translating instruction addresses and four pairs of BAT registers are used for 
translating data addresses. These eight pairs of BAT registers comprise two four-entry 
fully-associative BAT arrays (each BAT array entry corresponds to a pair of BAT registers). 
The BAT array is fully-associative in that any address can reside in any BAT. In addition, 
the effective address field of all four corresponding entries (instruction or data) is 
simultaneously compared with the effective address of the access to check for a match. 

[Figure 7-5: diagram of the instruction BAT array (SPR 528 through SPR 535) and the data 
BAT array (SPR 536 through SPR 543). In each array, the BEPI and valid fields of the 
entries are compared with the unmasked bits of EA0-EA14 and with MSR[PR].] 

Figure 7-5. BAT Array Organization 

Each pair of BAT registers defines the starting address of a block in the effective address 
space, the size of the block, and the start of the corresponding block in physical address 
space. If an effective address is within the range defined by a pair of BAT registers, its 
physical address is defined as the starting physical address of the block plus the lower-order 
effective address bits. 

Blocks are restricted to a finite set of sizes, from 128 Kbytes (2^17 bytes) to 256 Mbytes 
(2^28 bytes). The starting address of a block in both effective address space and physical 
address space is defined as a multiple of the block size. 
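The size and alignment rules, and the address calculation they make possible, can be checked with a few lines of arithmetic. The sketch assumes that the permitted sizes are the powers of two in the stated range; the function names are hypothetical, and real BAT registers encode the block size as a mask (see Section 7.4.3) rather than as a byte count.

```c
#include <stdbool.h>
#include <stdint.h>

#define BAT_MIN_BLOCK (UINT32_C(1) << 17)   /* 128 Kbytes */
#define BAT_MAX_BLOCK (UINT32_C(1) << 28)   /* 256 Mbytes */

/* Assumes the permitted sizes are the powers of two from 2^17 to 2^28. */
static bool bat_size_valid(uint32_t size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;
    return power_of_two && size >= BAT_MIN_BLOCK && size <= BAT_MAX_BLOCK;
}

/* A block must start on a multiple of its size in both address spaces. */
static bool bat_block_valid(uint32_t ea_start, uint32_t pa_start, uint32_t size)
{
    return bat_size_valid(size)
        && (ea_start % size) == 0
        && (pa_start % size) == 0;
}

/* Physical address = block's starting physical address plus the low-order
 * effective address bits. Requires a block accepted by bat_block_valid(). */
static uint32_t bat_physical_address(uint32_t ea, uint32_t pa_start, uint32_t size)
{
    uint32_t offset = ea & (size - 1);
    return pa_start | offset;   /* equals pa_start + offset for an aligned block */
}
```

Because the start addresses are multiples of the block size, the addition in the definition never carries into the high-order bits, which is why a bitwise OR suffices.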

It is an error for system software to program the BAT registers such that an effective address 
is translated by more than one valid IBAT pair or more than one valid DBAT pair. If this 
occurs, the results are undefined and may include a spurious violation of the memory 
protection mechanism, a machine check exception, or a checkstop condition. 

The equation for determining whether a BAT entry is valid for a particular access is as 
follows: 

BAT_entry_valid = (Vs & ¬MSR[PR]) | (Vp & MSR[PR]) 

If a BAT entry is not valid for a given access, it does not participate in address translation 
for that access. Two BAT entries may not map an overlapping effective address range and 
be valid at the same time. 

Entries that have complementary settings of Vs and Vp may map overlapping effective 
address blocks. Complementary settings would be as follows: 

BAT entry A: Vs = 1, Vp = 0 
BAT entry B: Vs = 0, Vp = 1 
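The validity equation translates directly into a boolean function. In this sketch `bat_entry_valid` is a hypothetical name, and `msr_pr` stands for MSR[PR] (false for supervisor mode, true for user mode).

```c
#include <stdbool.h>

/* BAT_entry_valid = (Vs & not MSR[PR]) | (Vp & MSR[PR]) */
static bool bat_entry_valid(bool vs, bool vp, bool msr_pr)
{
    return (vs && !msr_pr) || (vp && msr_pr);
}
```

For the complementary entries A and B above, exactly one entry is valid in each mode, so their overlapping effective address ranges never both take part in the same translation.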

7.4.2 Recognition of Addresses in BAT Arrays 

The BAT arrays are accessed in parallel with segmented address translation to determine 
whether a particular effective address corresponds to a block defined by the BAT arrays. If 
an effective address is within a valid BAT area, the segmented address translation is 
canceled and the physical address for the memory access is determined as described in 
Section 7.4.5, “Block Physical Address Generation.” 

Block address translation is enabled only when address translation is enabled 
(MSR[IR] = 1 and/or MSR[DR] = 1). Also, a matching BAT array entry always takes 
precedence over any segment descriptor translation, independent of the setting of the SR[T] 
bit, and the segment descriptor information is completely ignored. 

Figure 7-6 shows the flow of the BAT array comparison used in block address translation. 
When an instruction fetch operation is required, the effective address is compared with the 
four instruction BAT array entries; similarly, the effective addresses of data accesses are 
compared with the four data BAT array entries. The BAT arrays are fully-associative in that 
any of the four instruction or data BAT array entries can contain a matching entry (for an 
instruction or data access, respectively). 

NOTE: Figure 7-6 assumes that the protection bits, BATL[PP], allow an access to occur. 
If not, an exception is generated, as described in Section 7.4.4, “Block Memory 
Protection.” 

Figure 7-6. BAT Array Hit/Miss Flow 

Two BAT array entry fields are compared to determine if there is a BAT array hit — a block 
effective page index (BEPI) field, which is compared with the high-order effective address 
bits, and one of two valid bits (Vs or Vp), which is evaluated relative to the value of 
MSR[PR]. 

NOTE: Figure 7-6 assumes a block size of 128 Kbytes (all bits of BEPI are used in the 
comparison); the actual number of bits of the BEPI field that are used are masked 
by the BL field (block length) as described in Section 7.4.3, “BAT Register 
Implementation of BAT Array.” 

Thus, the specific criteria for determining a BAT array hit are as follows: 

• The upper-order 15 bits of the effective address, subject to a mask, must match the 
BEPI field in one of the BAT array entries. 

• The appropriate valid bit in the BAT array entry must be set to one as follows: 

- MSR[PR] = 0 corresponds to supervisor mode; in this mode, Vs is checked. 

- MSR[PR] = 1 corresponds to user mode; in this mode, Vp is checked. 

The matching entry is then subject to the protection checking described in Section 7.4.4, 
“Block Memory Protection,” before it is used as the source for the physical address. 

NOTE: If a user mode program performs an access with an effective address that 
matches the BEPI field of a BAT area defined as valid only for supervisor 
accesses (Vp = 0 and Vs = 1) for example, the BAT mechanism does not generate 
a protection violation and the BAT entry is simply ignored. Thus, a supervisor 
program can use the block address translation mechanism to share a portion of 
the effective address space with a user program (that uses page address 
translation for this area). 

If a memory area is to be mapped by the BAT mechanism for both instruction and data 
accesses, the mapping must be set up in both an IBAT and DBAT entry; this is the case even 
on implementations that do not have separate instruction and data caches. 

NOTE: A block can be defined to overlay part of a segment such that the block portion 
is nonpaged although the rest of the segment can be paged. This allows nonpaged 
areas to be specified within a segment. Thus, if an area of memory is translated 
by an instruction BAT entry and data accesses are not also required to that same 
area of memory, PTEs are not required for that area of memory. Similarly, if an 
area of memory is translated by a data BAT entry, and instruction accesses are 
not also required to that same area of memory, PTEs are not required for that area 
of memory. 

7.4.3 BAT Register Implementation of BAT Array 

Recall that the BAT array is composed of four entries used for instruction accesses and four 
entries used for data accesses. Each BAT array entry has 64 bits and consists of a pair of 
32-bit BAT registers: an upper and a lower BAT register for each entry. The BAT registers 
are accessed with the mtspr and mfspr instructions and are accessible only to 
supervisor-level programs. See Appendix F, “Simplified Mnemonics,” for a list of simplified 
mnemonics for use with the BAT registers. 

NOTE: Simplified mnemonics are referred to as extended mnemonics in the architecture 
specification. 

The format and bit definitions of the upper and lower BAT registers are shown in Figure 7-7 
and Figure 7-8, respectively. 

    Bits     0-14    15-18             19-29    30    31 
    Field    BEPI    Reserved (0000)   BL       Vs    Vp 

Figure 7-7. Format of Upper BAT Registers 

    Bits     0-14    15-24      25-28    29             30-31 
    Field    BRPN    Reserved   WIMG*    Reserved (0)   PP 

*W and G bits are not defined for IBAT registers. Attempting to write to these bits causes 
boundedly-undefined results. 

Figure 7-8. Format of Lower BAT Registers 

The BAT registers contain the effective-to-physical address mappings for blocks of 
memory. This mapping information includes the effective address bits that are compared 
with the effective address of the access, the memory/cache access mode bits (WIMG), and 
the protection bits for the block. 

In addition, the size of the block and the starting address of the block are defined by the 
physical block number (BRPN) and block size mask (BL) fields. 

NOTE: The W and G bits are defined for BAT registers that translate data accesses 
(DBAT registers); attempting to write to the W and G bits in IBAT registers 
causes boundedly-undefined results. 

Table 7-8 describes the bits in the upper and lower BAT registers. 


Table 7-8. BAT Registers: Field and Bit Descriptions for 32-Bit Implementations 

Register     Bits     Name    Description 
---------    -----    ----    ------------------------------------------------------------ 
Upper BAT    0-14     BEPI    Block effective page index. This field is compared with 
                              high-order bits of the logical address to determine if there 
                              is a hit in that BAT array entry. (Note that the architecture 
                              specification refers to logical address as effective address.) 
             15-18    -       Reserved 
             19-29    BL      Block length. BL is a mask that encodes the size of the 
                              block. Values for this field are listed in Table 2-12. 
             30       Vs      Supervisor mode valid bit. This bit interacts with MSR[PR] to 
                              determine if there is a match with the logical address. For 
                              more information, see Section 7.4.2, “Recognition of 
                              Addresses in BAT Arrays.” 
             31       Vp      User mode valid bit. This bit also interacts with MSR[PR] to 
                              determine if there is a match with the logical address. For 
                              more information, see Section 7.4.2, “Recognition of 
                              Addresses in BAT Arrays.” 
Lower BAT    0-14     BRPN    This field is used in conjunction with the BL field to 
                              generate high-order bits of the physical address of the block. 
             15-24    -       Reserved 
             25-28    WIMG    Memory/cache access mode bits: 
                                W  Write-through 
                                I  Caching-inhibited 
                                M  Memory coherence 
                                G  Guarded 
                              Attempting to write to the W and G bits in IBAT registers 
                              causes boundedly-undefined results. For detailed information 
                              about the WIMG bits, see Section 5.2.1, “Memory/Cache Access 
                              Attributes.” 
             29       -       Reserved 
             30-31    PP      Protection bits for block. This field determines the 
                              protection for the block as described in Section 7.4.4, 
                              “Block Memory Protection.” 
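
The field positions in Table 7-8 can be illustrated with a small set of C helpers that pack and unpack the two registers. This is only a sketch: the type and function names are hypothetical, and the code assumes the register image is held in a uint32_t whose most-significant bit corresponds to bit 0 in the numbering used by this manual.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 in the manual's numbering is the most-significant bit of the
 * 32-bit register image, so a field ending at bit n is shifted by 31 - n. */
typedef struct {
    uint32_t bepi; /* upper BAT bits 0-14  */
    uint32_t bl;   /* upper BAT bits 19-29 */
    uint32_t vs;   /* upper BAT bit 30     */
    uint32_t vp;   /* upper BAT bit 31     */
} bat_upper_t;

typedef struct {
    uint32_t brpn; /* lower BAT bits 0-14  */
    uint32_t wimg; /* lower BAT bits 25-28 */
    uint32_t pp;   /* lower BAT bits 30-31 */
} bat_lower_t;

uint32_t bat_upper_encode(bat_upper_t f)
{
    return ((f.bepi & 0x7FFFu) << 17) | ((f.bl & 0x7FFu) << 2) |
           ((f.vs & 1u) << 1) | (f.vp & 1u);
}

bat_upper_t bat_upper_decode(uint32_t batu)
{
    bat_upper_t f;
    f.bepi = batu >> 17;
    f.bl   = (batu >> 2) & 0x7FFu;
    f.vs   = (batu >> 1) & 1u;
    f.vp   = batu & 1u;
    return f;
}

uint32_t bat_lower_encode(bat_lower_t f)
{
    /* Reserved bits 15-24 and 29 are written as zeros. */
    return ((f.brpn & 0x7FFFu) << 17) | ((f.wimg & 0xFu) << 3) | (f.pp & 3u);
}

bat_lower_t bat_lower_decode(uint32_t batl)
{
    bat_lower_t f;
    f.brpn = batl >> 17;
    f.wimg = (batl >> 3) & 0xFu;
    f.pp   = batl & 3u;
    return f;
}
```

For example, a 256-Mbyte block at effective address 0x8000_0000 that is valid in both modes has BEPI = 0x4000, BL = 0x7FF, Vs = 1, and Vp = 1, which encodes to an upper register image of 0x8000_1FFF.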


The BL field in the upper BAT register is a mask that encodes the size of the block. 
Table 7-9 defines the bit encodings for the BL field of the upper BAT register. 

Table 7-9. Upper BAT Register Block Size Mask Encodings 

Block Size     BL Encoding 
-----------    ------------- 
128 Kbytes     000 0000 0000 
256 Kbytes     000 0000 0001 
512 Kbytes     000 0000 0011 
1 Mbyte        000 0000 0111 
2 Mbytes       000 0000 1111 
4 Mbytes       000 0001 1111 
8 Mbytes       000 0011 1111 
16 Mbytes      000 0111 1111 
32 Mbytes      000 1111 1111 
64 Mbytes      001 1111 1111 
128 Mbytes     011 1111 1111 
256 Mbytes     111 1111 1111 
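
Every encoding in Table 7-9 equals the block size divided by 128 Kbytes, minus one. The following sketch computes the mask that way; the function and macro names are hypothetical and are not part of the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BAT_MIN_BLOCK (1u << 17) /* 128 Kbytes */
#define BAT_MAX_BLOCK (1u << 28) /* 256 Mbytes */

/* Stores the 11-bit BL mask for a block of the given size in *bl and
 * returns true, or returns false if the size is not one of the sizes
 * listed in Table 7-9 (a power of two from 128 Kbytes to 256 Mbytes). */
bool bat_bl_from_size(uint32_t size, uint32_t *bl)
{
    bool power_of_two = size != 0 && (size & (size - 1u)) == 0;

    if (!power_of_two || size < BAT_MIN_BLOCK || size > BAT_MAX_BLOCK)
        return false;
    *bl = (size >> 17) - 1u;
    return true;
}

/* Inverse mapping: block size in bytes for a valid BL mask. */
uint32_t bat_size_from_bl(uint32_t bl)
{
    return (bl + 1u) << 17;
}
```

A size that is not in the table, such as 192 Kbytes, is rejected rather than rounded, because only the listed values are valid for BL.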


Only the values shown in Table 7-9 are valid for BL. An effective address is determined to 
be within a BAT area if the appropriate bits (determined by the BL field) of the effective 
address match the value in the BEPI field of the upper BAT register, and if the appropriate 
valid bit (Vs or Vp) is set. 

NOTE: For an access to occur, the protection bits (PP bits) in the lower BAT register 
must be set appropriately, as described in Section 7.4.4, “Block Memory 
Protection.” 

The BL field selects the bits of the effective address that are used in the comparison with 
the BEPI field. The 11-bit BL field is aligned with the effective address bits EA[4-14]. For 
every zero in the BL field, the corresponding bit of the effective address is used in the 
comparison. For every one in the BL field, the corresponding bit of the EA is zeroed. 
Effective address bits EA[0-3] are always used. The 15 bits selected are compared to the 
BEPI for a match. 
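
The masked comparison can be sketched in C as follows. The function name is hypothetical, and the check covers only the address comparison; the valid bits and the protection bits are evaluated separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the effective address selects the block described by
 * the 15-bit BEPI field and the 11-bit BL mask. */
bool bat_ea_matches(uint32_t ea, uint32_t bepi, uint32_t bl)
{
    uint32_t ea_index = ea >> 17;                    /* EA[0-14] */
    uint32_t selected = ea_index & ~(bl & 0x7FFu);   /* zero EA bits where BL = 1 */

    return selected == (bepi & 0x7FFFu);
}
```

With BL = 0x7FF (256 Mbytes) only EA[0-3] take part in the comparison, while with BL = 0 (128 Kbytes) all 15 bits do.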

The value loaded into the BL field determines both the size of the block and the alignment 
of the block in physical address space. The values loaded into the BEPI and BRPN fields 
must have at least as many low-order zeros as there are ones in BL. Otherwise, the results 
are undefined. Also, if the processor does not support 32 bits of physical address, the system 
software should write zeros to those unsupported bits in the BRPN field (as the 
implementation treats them as reserved). Otherwise, a machine check exception can occur. 
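
System software might verify these constraints before loading a BAT pair. The check below is a sketch with a hypothetical function name; it does not cover the implementation-specific limit on supported physical address bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if BL is one of the valid low-order-ones masks and both
 * BEPI and BRPN have zeros in every position where BL has a one. */
bool bat_fields_consistent(uint32_t bepi, uint32_t brpn, uint32_t bl)
{
    /* A valid BL is 0 or a contiguous run of low-order ones, so adding 1
     * must clear every bit that was set. */
    bool bl_valid = bl <= 0x7FFu && (bl & (bl + 1u)) == 0;

    return bl_valid && (bepi & bl) == 0 && (brpn & bl) == 0;
}
```

Because every valid BL value is a run of low-order ones, the requirement of "at least as many low-order zeros as there are ones in BL" reduces to a bitwise AND with the mask.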

7.4.4 Block Memory Protection 

When the selected bits of the effective address match the BEPI in the BAT array and the 
valid bit is set for the current mode (Supervisor or User), the access is checked for validity 
by the memory protection mechanism. If this protection mechanism prohibits the access, a 
block protection violation exception condition (DSI or ISI exception) is generated. 

The memory protection mechanism allows selectively granting read access, granting 
read/write access, and prohibiting access to areas of memory based on a number of control 
criteria. The block protection mechanism provides protection at the granularity defined by 
the block size (128 Kbytes to 256 Mbytes). 

Because the memory protection mechanisms used by block and page address translation 
differ, refer to Section 7.5.4, “Page Memory Protection,” for specific information unique 
to page address translation. 

For block address translation, the memory protection mechanism is controlled by the PP 
bits (which are located in the lower BAT register), which define the access options for the 
block. Table 7-10 shows the types of accesses that are allowed for the possible PP bit 
combinations. 


Table 7-10. Access Protection Control for Blocks 

PP     Accesses Allowed 
---    ---------------- 
00     No access 
x1     Read only 
10     Read/write 

Thus, any access attempted (read or write) when PP = 00 results in a protection violation 
exception condition. When PP = x1, an attempt to perform a write access causes a 
protection violation exception condition, and when PP = 10, all accesses are allowed. When 
the memory protection mechanism prohibits a reference, one of the following occurs, 
depending on the type of access that was attempted: 

• For data accesses, a DSI exception is generated and bit 4 of DSISR is set. 

• For instruction accesses, an ISI exception is generated and bit 4 of SRR1 is set. 

See Chapter 6, “Exceptions,” for more information about these exceptions. 

Table 7-11 shows a summary of the conditions that cause exceptions for supervisor and 
user read and write accesses within a BAT area. Each BAT array entry is programmed to be 
either used or ignored for supervisor and user accesses via the BAT array entry valid bits, 
and the PP bits enforce the read/write protection options. 

NOTE: The valid bits (Vs and Vp) are used as part of the match criteria for a BAT array 
entry and are not explicitly part of the protection mechanism. 


Table 7-11. Access Protection Summary for BAT Array 

Vs   Vp   PP   Block Type               User Read   User Write   Supervisor Read   Supervisor Write 
--   --   --   ----------------------   ---------   ----------   ---------------   ---------------- 
0    0    xx   No BAT array match       Not used    Not used     Not used          Not used 
0    1    00   User, no access          Exception   Exception    Not used          Not used 
0    1    x1   User read-only           Allowed     Exception    Not used          Not used 
0    1    10   User read/write          Allowed     Allowed      Not used          Not used 
1    0    00   Supervisor, no access    Not used    Not used     Exception         Exception 
1    0    x1   Supervisor read-only     Not used    Not used     Allowed           Exception 
1    0    10   Supervisor read/write    Not used    Not used     Allowed           Allowed 
1    1    00   Both, no access          Exception   Exception    Exception         Exception 
1    1    x1   Both read-only           Allowed     Exception    Allowed           Exception 
1    1    10   Both read/write          Allowed     Allowed      Allowed           Allowed 

Note: The term ‘Not used’ implies that the access is not translated by the BAT array and is translated by the 
page address translation mechanism described in Section 7.5, “Memory Segment Model,” instead. 
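
The rows of Table 7-11 follow from two independent checks: the valid bit for the current mode decides whether the entry is used at all, and the PP bits then decide whether the access is allowed. The sketch below encodes that logic. All names are hypothetical, the effective address is assumed to have already matched the BEPI field, and instruction fetches are treated as reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    BAT_NOT_USED,  /* entry ignored; page address translation is used instead */
    BAT_EXCEPTION, /* protection violation (DSI or ISI exception)             */
    BAT_ALLOWED    /* access proceeds through the BAT entry                   */
} bat_outcome_t;

bat_outcome_t bat_access_outcome(bool vs, bool vp, uint32_t pp,
                                 bool user_mode, bool is_write)
{
    bool valid = user_mode ? vp : vs;

    if (!valid)
        return BAT_NOT_USED;
    pp &= 3u;
    if (pp == 0u)
        return BAT_EXCEPTION;            /* PP = 00: no access      */
    if ((pp & 1u) != 0u && is_write)
        return BAT_EXCEPTION;            /* PP = x1: read only      */
    return BAT_ALLOWED;                  /* PP = 10, or read of x1  */
}
```

Note that a user access to an entry with Vs = 1 and Vp = 0 yields BAT_NOT_USED rather than BAT_EXCEPTION, which is what allows a supervisor-only block to overlay a region that user programs reach through page address translation.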


NOTE: Because access to the BAT registers is privileged, only supervisor programs can 
modify the protection and valid bits or any other bits in the BAT for the block. 

Figure 7-9 expands on the actions taken by the processor in the case of a memory protection 
violation. 

NOTE: The dcbt and dcbtst instructions do not cause exceptions; in the case of a 
memory protection violation for the attempted execution of one of these 
instructions, the translation is aborted and the instruction executes as a no-op (no 
violation is reported). 

Refer to Chapter 6, “Exceptions,” for a complete description of the SRR1 and 
DSISR bit settings for the protection violation exceptions. 



Figure 7-9. Memory Protection Violation Flow for Blocks 

7.4.5 Block Physical Address Generation 

Access to the physical memory within the block is made according to the memory/cache 
access mode defined by the WIMG bits in the lower BAT register. These bits apply to the 
entire block rather than to an individual page as described in Section 5.2.1, 
“Memory/Cache Access Attributes.” 
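
The following sketch illustrates how a physical address can be formed from a matching BAT entry. It assumes the scheme implied by Table 7-8, in which BRPN supplies the high-order physical address bits wherever BL is zero, the effective address supplies them wherever BL is one, and the low-order 17 bits (EA[15-31]) pass through unchanged. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Forms a 32-bit physical address from an effective address, the 15-bit
 * BRPN field, and the 11-bit BL mask of the matching BAT entry. */
uint32_t bat_physical_address(uint32_t ea, uint32_t brpn, uint32_t bl)
{
    uint32_t mask     = bl & 0x7FFu;
    uint32_t ea_index = (ea >> 17) & 0x7FFu;               /* EA[4-14] */
    uint32_t high     = (brpn & ~mask & 0x7FFFu)           /* bits taken from BRPN */
                      | (ea_index & mask);                 /* bits taken from the EA */

    return (high << 17) | (ea & 0x1FFFFu);                 /* append EA[15-31] */
}
```

For a 256-Mbyte block with BRPN = 0, effective address 0x8123_4567 maps to physical address 0x0123_4567: only the upper four bits are replaced.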


Figure 7-10. Block Physical Address Generation 

7.4.6 Block Address Translation Summary 

Figure 7-11 is an expansion of the ‘BAT Array Hit’ branch of Figure 7-3 and shows the 
translation of address bits. 

NOTE: Figure 7-11 does not show when many of the exceptions in Table 7-5 are 
detected or taken as this is implementation-specific. 



Figure 7-11. Block Address Translation Flow 

7.5 Memory Segment Model 

A large virtual memory address space (52-bit address) in the PowerPC OEA is divided into 
256-Mbyte segments. This segmented memory model provides a way to map programs into 
unique virtual address spaces, which are further subdivided into 4-Kbyte pages. Each 
4-Kbyte virtual page is allocated a 4-Kbyte physical memory location based on the needs of 
the program. 

A page address translation may be superseded by a matching block address translation as 
described in Section 7.4, “Block Address Translation.” If not, the page translation proceeds 
in the following two steps: 

1. From effective address to virtual address (which never exists as a specific entity 
but can be considered to be the concatenation of the virtual segment ID (VSID), the 
page index, and the byte offset within a page), and 

2. From virtual address to physical address. 

The page address translation mechanism is described in the following sections, followed by 
a summary of page address translation with a detailed flow diagram. 

7.5.1 Address Translation via Segment Descriptors 

If the effective address is not translated via the BAT function, the segment descriptors are 
used. If the T bit is set, the translation proceeds as for a direct-store segment. Otherwise, a 
virtual address is generated, which ultimately maps to a physical address. Segment 
descriptors also contain protection control bits and, in the case of direct-store segments, bus 
unit or controller information. 

Segments in the OEA can be classified as one of the following two types: 

• Memory segment — An effective address in these segments generates a virtual 
address that is mapped to a physical address via the page table entry (PTE) facility. 

• Direct-store segment — References made to direct-store segments do not use the 
virtual paging mechanism of the processor. This facility allows direct 
communication with I/O devices on the System Bus. 

NOTE: The direct-store facility is optional and being phased out of the 
architecture. See Section 7.7, “Direct-Store Segment Address 
Translation,” for a complete description of the mapping of direct-store 
segments for those processors that implement it. 

The T bit in the segment descriptor selects between memory segments and direct-store 
segments, as shown in Table 7-12. 


Table 7-12. Segment Descriptor Types 

Segment Descriptor T Bit    Segment Type 
------------------------    ------------------------------------------------------- 
0                           Memory segment 
1                           Direct-store segment (optional, but being phased out of 
                            the architecture; its use is discouraged) 


7.5.1.1 Selection of Memory Segments 

All accesses generated by the processor can be mapped to a segment descriptor; however, 
if translation is disabled (MSR[IR] = 0 or MSR[DR] = 0 for an instruction or data access, 
respectively), real addressing mode is performed as described in Section 7.3, “Real 
Addressing Mode.” Otherwise, if T = 0 in the corresponding segment descriptor (and the 
address is not translated by the BAT mechanism), the access maps to virtual memory space 
and page address translation is performed. 

After a memory segment is selected, the processor creates the virtual address for the 
segment and searches for the PTE that dictates the physical page number to be used for the 
access. Note that I/O devices can be easily mapped onto memory space and used as 
memory-mapped I/O. 

7.5.1.2 Selection of Direct-Store Segments 

As described for memory segments, all accesses generated by the processor (with 
translation enabled) map to a segment descriptor. If T = 1 for the selected segment 
descriptor, the access maps to the direct-store interface space and the access proceeds as 
described in Section 7.7, “Direct-Store Segment Address Translation.” Because the direct- 
store interface is present only for compatibility with existing I/O devices that used this 
interface and because the direct-store interface protocol is not optimized for performance, 
its use is discouraged. Additionally, the direct-store facility is being phased out of the 
architecture and future processors are not likely to support it. Thus, software should not 
depend on its results and new software should not use it. A more common method for 
accessing I/O is by mapping memory segments on I/O devices (memory mapped I/O). 

7.5.2 Page Address Translation Overview 

The translation of effective addresses to physical addresses is shown in Figure 7-12: 

• Bits 0-3 of the effective address comprise the segment register number used to select 
a segment descriptor, from which the virtual segment ID (VSID) is extracted. 

• Bits 4-19 of the effective address define the page number (index) within the 
segment; these bits are concatenated with the VSID from the segment descriptor to 
form the virtual page number (VPN). The VPN is used to search for the PTE in the 
TLB. If the VPN is not in the TLB, a search is made of the page table in main 
memory. The PTE then provides the physical page number (also known as the real 
page number, or RPN). 

• Bits 20-31 of the effective address are the byte offset within the page; these are 
concatenated with the real page number (RPN) field of a PTE to form the physical 
(real) address used to access memory. 
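
The field boundaries listed above can be sketched in C as follows. The type and function names are hypothetical; the VSID and RPN are passed in directly because the segment register read and the PTE search are outside the scope of this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sr_number;   /* EA[0-3]: selects one of 16 segment registers */
    uint32_t page_index;  /* EA[4-19]: page within the segment            */
    uint32_t byte_offset; /* EA[20-31]: byte within the 4-Kbyte page      */
} ea_fields_t;

ea_fields_t ea_split(uint32_t ea)
{
    ea_fields_t f;
    f.sr_number   = ea >> 28;
    f.page_index  = (ea >> 12) & 0xFFFFu;
    f.byte_offset = ea & 0xFFFu;
    return f;
}

/* 40-bit virtual page number: 24-bit VSID concatenated with the 16-bit
 * page index. */
uint64_t make_vpn(uint32_t vsid, uint32_t page_index)
{
    return ((uint64_t)(vsid & 0xFFFFFFu) << 16) | (page_index & 0xFFFFu);
}

/* 32-bit physical address: 20-bit RPN concatenated with the 12-bit byte
 * offset. */
uint32_t make_physical_address(uint32_t rpn, uint32_t byte_offset)
{
    return ((rpn & 0xFFFFFu) << 12) | (byte_offset & 0xFFFu);
}
```

Appending the 12-bit byte offset to the 40-bit VPN gives the 52-bit virtual address mentioned at the start of Section 7.5.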
Figure 7-12. Page Address Translation Overview 

7.5.2.1 Segment Descriptor Definitions 


The fields in the segment descriptors are interpreted differently depending on the value of 
the T bit within the descriptor. When T = 1, the segment descriptor defines a direct-store 
segment, and the format is as described in Section 7.7.1, “Segment Descriptors for 
Direct-Store Segments.” 

7.5.2.1.1 Segment Descriptor Format 

The segment descriptors are 32 bits long and reside in one of 16 segment registers. 
Figure 7-13 shows the format of a segment register used in page address translation (T = 0). 

    Bits     0    1     2     3    4-7               8-31 
    Field    T    Ks    Kp    N    Reserved (0000)   VSID 

Figure 7-13. Segment Register Format for Page Address Translation 

Table 7-13 provides the corresponding bit definitions of the segment register. 


Table 7-13. Segment Register Bit Definition for Page Address Translation 

Bit     Name    Description 
----    ----    ------------------------------- 
0       T       T = 0 selects this format 
1       Ks      Supervisor-state protection key 
2       Kp      User-state protection key 
3       N       No-execute protection bit 
4-7     -       Reserved 
8-31    VSID    Virtual segment ID 
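
A segment register image in this format can be decoded as sketched below. The structure and function names are hypothetical, and the code assumes bit 0 is the most-significant bit of the uint32_t.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t t;    /* bit 0: 0 selects the page address translation format */
    uint32_t ks;   /* bit 1: supervisor-state protection key               */
    uint32_t kp;   /* bit 2: user-state protection key                     */
    uint32_t n;    /* bit 3: no-execute protection bit                     */
    uint32_t vsid; /* bits 8-31: virtual segment ID                        */
} sr_fields_t;

sr_fields_t sr_decode(uint32_t sr)
{
    sr_fields_t f;
    f.t    = (sr >> 31) & 1u;
    f.ks   = (sr >> 30) & 1u;
    f.kp   = (sr >> 29) & 1u;
    f.n    = (sr >> 28) & 1u;
    f.vsid = sr & 0x00FFFFFFu;
    return f;
}
```

The decoded fields are meaningful only when T = 0; a descriptor with T = 1 uses the direct-store format instead.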


The Ks and Kp bits partially define the access protection for the pages within the 
segment. The page protection provided in the PowerPC OEA is described in Section 7.5.4, 
“Page Memory Protection.” 

The virtual segment ID field is used as the high-order bits of the virtual page number (VPN) 
as shown in Figure 7-12. 

The segment registers are accessed with specific instructions that read and write them. 
However, since the segment registers described here are merely a conceptual model, a 
processor may implement separate segment register files each containing 16 registers for 
instructions and for data. In this case, it is the responsibility of the system (either hardware 
or software) to maintain the consistency between the multiple sets of segment register files. 

The segment register instructions are summarized in Table 7-14. These instructions are 
privileged in that they are executable only while operating in supervisor mode. See 
Section 2.3.17, “Synchronization Requirements for Special Registers and for Lookaside 
Buffers,” for information about the synchronization requirements when modifying the 
segment registers. See Chapter 8, “Instruction Set,” for more detail on the encodings of 
these instructions. 


Table 7-14. Segment Register Instructions 

Instruction      Description 
-------------    ----------------------------------- 
mtsr SR,rS       Move to Segment Register 
                 SR[SR] ← rS 
mtsrin rS,rB     Move to Segment Register Indirect 
                 SR[rB[0-3]] ← rS 
mfsr rD,SR       Move from Segment Register 
                 rD ← SR[SR] 
mfsrin rD,rB     Move from Segment Register Indirect 
                 rD ← SR[rB[0-3]] 

7.5.2.2 Page Table Entry (PTE) Definitions 

Page table entries (PTEs) are generated and placed in the page table in memory by the 
operating system using the hashing algorithm described in Section 7.6.1.3, “Page Table 
Hashing Functions.” The PowerPC OEA defines each PTE as 64 bits (two 32-bit words), 
with the following fields: 

• Word 0: 

   - The valid bit V is bit 0. A one in this bit indicates the PTE is valid. 

   - The virtual segment ID field is 24 bits and is found in bits 1-24. 

   - The hash bit H is found in bit 25. 

   - The API field is 6 bits and is found in bits 26-31. These bits are the high-order 
     6 bits of the page index. See Figure 7-12. 

• Word 1: 

   - The RPN field is 20 bits and is found in bits 32-51 of the PTE. It contains the 
     physical (real) page number. 

   - The R and C bits are found in bits 55-56 of the PTE and maintain history 
     information for the page as described in Section 7.5.3, “Page History Recording.” 

   - The WIMG field is 4 bits and is found in bits 57-60 of the PTE. It defines the 
     memory/cache control mode for accesses to the page. 

   - The PP bits are found in bits 62-63 of the PTE and define the remaining access 
     protection constraints for the page. The page protection provided by PowerPC 
     processors is described in Section 7.5.4, “Page Memory Protection.” 

The first 32 bits contain the valid bit V, the virtual segment ID (VSID), the hash bit H, and 
the abbreviated page index (API). These 32 bits are used as match criteria when searching 
through the PTEs for a match to a virtual address. 

Conceptually, the page table in memory must be searched to translate the address of every 
reference. For performance reasons, however, some processors use TLBs to cache copies 
of recently used PTEs so that the table search time is eliminated for most accesses. In this 
case, the TLB is searched for the address translation first. If a copy of the PTE is found, 
then no page table search is performed. As TLBs are noncoherent caches of PTEs, software 
that changes the page table in any way must perform the appropriate TLB invalidate 
operations to keep the TLBs coherent with respect to the page table in memory. 

7.5.2.2.1 PTE Format 

Figure 7-14 shows the format of the two words that comprise a PTE. 

    Word 0 
    Bits     0    1-24    25    26-31 
    Field    V    VSID    H     API 

    Word 1 
    Bits     0-19    20-22            23    24    25-28    29             30-31 
    Field    RPN     Reserved (000)   R     C     WIMG     Reserved (0)   PP 

Figure 7-14. Page Table Entry Format 

Table 7-15 lists the corresponding bit definitions for each word in a PTE as defined above. 


Table 7-15. PTE Bit Definitions 

Word    Bit      Name    Description 
----    -----    ----    -------------------------------------- 
0       0        V       Entry valid (V = 1) or invalid (V = 0) 
        1-24     VSID    Virtual segment ID 
        25       H       Hash function identifier 
        26-31    API     Abbreviated page index 
1       0-19     RPN     Physical page number 
        20-22    -       Reserved 
        23       R       Referenced bit 
        24       C       Changed bit 
        25-28    WIMG    Memory/cache control bits 
        29       -       Reserved 
        30-31    PP      Page protection bits 
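
As an illustration of the layout in Table 7-15, the sketch below builds the word 0 match pattern for a virtual address and extracts the word 1 fields. The type and function names are hypothetical; the page table search itself (hashing and PTEG lookup) is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t word0; /* V, VSID, H, API       */
    uint32_t word1; /* RPN, R, C, WIMG, PP   */
} pte_t;

/* Returns true if word 0 of the PTE is valid and matches the given VSID,
 * hash function identifier, and page index. Only the high-order 6 bits
 * of the 16-bit page index (the API) are stored in the PTE. */
bool pte_matches(pte_t pte, uint32_t vsid, uint32_t page_index, bool h)
{
    uint32_t api      = (page_index >> 10) & 0x3Fu;
    uint32_t expected = (1u << 31)                    /* V = 1, bit 0    */
                      | ((vsid & 0xFFFFFFu) << 7)     /* VSID, bits 1-24 */
                      | ((h ? 1u : 0u) << 6)          /* H, bit 25       */
                      | api;                          /* API, bits 26-31 */

    return pte.word0 == expected;
}

uint32_t pte_rpn(pte_t pte)  { return pte.word1 >> 12; }          /* bits 0-19  */
uint32_t pte_r(pte_t pte)    { return (pte.word1 >> 8) & 1u; }    /* bit 23     */
uint32_t pte_c(pte_t pte)    { return (pte.word1 >> 7) & 1u; }    /* bit 24     */
uint32_t pte_wimg(pte_t pte) { return (pte.word1 >> 3) & 0xFu; }  /* bits 25-28 */
uint32_t pte_pp(pte_t pte)   { return pte.word1 & 3u; }           /* bits 30-31 */
```

Because the V bit is part of the 32-bit comparison, an invalid PTE can never match, regardless of its other fields.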


7.5.3 Page History Recording 

Referenced (R) and changed (C) bits reside in each PTE to keep history information about 
the page. The operating system then uses this information to determine which areas of 
memory to write back to disk when new pages must be allocated in main memory. 
Referenced and changed recording is performed only for accesses made with page address 
translation and not for translations made with the BAT mechanism or for accesses that 
correspond to direct-store (T = 1) segments. Furthermore, R and C bits are maintained only 
for accesses made while address translation is enabled (MSR[IR] = 1 or MSR[DR] = 1). 

In general, the referenced and changed bits are updated to reflect the status of the page 
based on the access, as shown in Table 7-16. 

Table 7-16. Table Search Operations to Update History Bits 

R and C bits    Processor Action 
------------    ----------------------------------------- 
00              Page has not been referenced 
01              Combination doesn’t occur 
10              Page has been referenced but not modified 
11              Page has been modified 


The operating system uses the R and C bits to determine at a later time which pages in 
memory can be replaced with pages on disk. Because user programs and their data can be 
much larger than the space available in memory, only a small fraction of the total address 
space of a program might be resident in main memory in the form of 4-Kbyte pages. On a 
page fault, the system needs to remove a page from memory. Pages with neither the R nor 
the C bit set are removed first; a new page can simply be read in over these unused pages. 
The PTE in memory must be updated to reflect the removal of one page and the loading of 
another. Pages with only the R bit set are the next candidates for removal. Finally, if only 
pages with both the R and C bits set remain in memory, these pages are swapped. A set C 
bit indicates that a data item in the page has been modified, so these pages must be written 
to disk before the new page from disk can be read into their space. 
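
The removal order described above can be sketched as a simple classification of word 1 of each PTE. This is an illustration only; the function names are hypothetical, and a real operating system would apply its own replacement policy on top of the R and C bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_R (1u << 8) /* referenced bit: word 1, bit 23 */
#define PTE_C (1u << 7) /* changed bit:    word 1, bit 24 */

/* 0: neither R nor C set (removed first); 1: only R set;
 * 2: C set (must be written to disk before its space is reused). */
int removal_class(uint32_t pte_word1)
{
    if (pte_word1 & PTE_C)
        return 2;
    if (pte_word1 & PTE_R)
        return 1;
    return 0;
}

bool needs_writeback(uint32_t pte_word1)
{
    return (pte_word1 & PTE_C) != 0;
}

/* Returns the index of the first page in the lowest removal class, or -1
 * if the array is empty. */
long pick_victim(const uint32_t *pte_word1, size_t n)
{
    long best = -1;
    int best_class = 3;

    for (size_t i = 0; i < n; i++) {
        int cls = removal_class(pte_word1[i]);
        if (cls < best_class) {
            best_class = cls;
            best = (long)i;
        }
    }
    return best;
}
```

Choosing an unmodified page when one is available avoids the disk write that a page with the C bit set would require.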

The R bit for a page may be set by the execution of the dcbt or dcbtst instruction to that 
page. However, neither of these instructions causes the C bit to be set. 

7.5.3.1 Referenced Bit 

The referenced bit for each real page is located in the PTE. Every time a page is referenced 
(by an instruction fetch, or any other read access) the referenced bit is set in the page table. 
The referenced bit may be set immediately, or the setting may be delayed until the memory 
access is determined to be successful. Because the reference to a page is what causes a PTE 
to be loaded into the TLB, some processors may assume the R bit in the TLB is always set. 
The processor never automatically clears the referenced bit. 

The referenced bit is only a hint to the operating system about the activity of a page. At 
times, the referenced bit may be set although the access was not logically required by the 
program or even if the access was prevented by memory protection. Examples of this 
include the following: 

• Fetching of instructions not subsequently executed 

• Accesses generated by an lswx or stswx instruction with a zero length 

• Accesses generated by a stwcx. instruction when no store is performed 

• Accesses that cause exceptions and are not completed 

7.5.3.2 Changed Bit 

The changed bit for each virtual page is located both in the PTE in the page table and in the 
copy of the PTE loaded into the TLB (if a TLB is implemented). Whenever a data store 
instruction is executed successfully, if the TLB search (for page address translation) results 
in a hit, the changed bit in the matching TLB entry is checked. If it is already set, no 
additional action is required. If the TLB changed bit is 0, it is set and a table search 
operation is performed to set the C bit in the corresponding PTE in the page table. 
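
The TLB-hit case described above can be modeled as follows. This is a simplified sketch: the structure, the pointer to the in-memory PTE, and the function name are all hypothetical, and the table search is reduced to a direct update of word 1 of the PTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_C (1u << 7) /* changed bit: word 1, bit 24 */

typedef struct {
    bool      c;         /* copy of the changed bit held in the TLB entry         */
    uint32_t *pte_word1; /* hypothetical pointer to word 1 of the PTE in memory   */
} tlb_entry_t;

/* Models a successful store that hit in the TLB. Returns true if the page
 * table in memory had to be updated, false if the TLB copy of C was
 * already set and no further action was needed. */
bool record_store(tlb_entry_t *entry)
{
    if (entry->c)
        return false;
    entry->c = true;
    *entry->pte_word1 |= PTE_C;
    return true;
}
```

Only the first store to a page after its PTE is loaded into the TLB pays the cost of the page table update; later stores find the TLB changed bit already set.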

Processors cause the changed bit (in both the PTE in the page tables and in the TLB if 
implemented) to be set only when a store operation is allowed by the page memory 
protection mechanism and the store is guaranteed to be in the execution path, unless an 
exception other than one caused by the following occurs: 

• System-caused interrupts (system reset, machine check, external, and decrementer 
interrupts) 

• Floating-point enabled exception type program exceptions when the processor is in 
an imprecise mode 

• Floating-point assist exceptions for instructions that cause no other kind of precise 
exception 

Furthermore, the following conditions may cause the C bit to be set: 

• The execution of an stwcx. instruction is allowed by the memory protection 
mechanism but a store operation is not performed. 

• The execution of an stswx instruction is allowed by the memory protection 
mechanism but a store operation is not performed because the specified length is 
zero. 

• A dcba or dcbi instruction is executed. 

No other cases cause the C bit to be set. 

7.5.3.3 Scenarios for Referenced and Changed Bit Recording 

This section provides a summary of the model (defined by the OEA) used by PowerPC 
processors that maintain the referenced and changed bits automatically in hardware, in the 
setting of the R and C bits. In some scenarios, the bits are guaranteed to be set by the 
processor; in some scenarios, the architecture allows that the bits may be set (not absolutely 
required); and in some scenarios, the bits are guaranteed to not be set. 

NOTE: When the hardware updates the R and C bits in memory, the accesses are 
performed as a physical memory access, as if the WIMG bit settings were 
0b0010 (that is, as unguarded cacheable operations in which coherency is 
required). 

In implementations that do not maintain the R and C bits in hardware, software assistance 
is required. For these processors, the information in this section still applies, except that the 
software performing the updates is constrained to the rules described (that is, must set bits 
shown as guaranteed to be set and must not set bits shown as guaranteed to not be set). 

NOTE: This software should be contained in the area of memory reserved for 
implementation-specific use and should be invisible to the operating system. 

Table 7-17 defines a prioritized list of the R and C bit settings for all scenarios. The entries 
in the table are prioritized from top to bottom, such that a matching scenario occurring 
closer to the top of the table takes precedence over a matching scenario closer to the bottom 
of the table. For example, if an stwcx. instruction causes a protection violation and there is 
no reservation, the C bit is not altered, as shown for the protection violation case. 

In the table, load operations include those generated by load instructions, by the eciwx 
instruction, and by the cache management instructions that are treated as loads with respect 
to address translation. Similarly, store operations include those operations generated by 
store instructions, by the ecowx instruction, and by the cache management instructions that 
are treated as stores with respect to address translation. 


Table 7-17. Model for Guaranteed R and C Bit Settings 

Priority  Scenario                                                 Causes Setting  Causes Setting
                                                                   of R Bit        of C Bit
--------  -------------------------------------------------------  --------------  --------------
1         No-execute protection violation                          No              No
2         Page protection violation                                Maybe           No
3         Out-of-order instruction fetch or load operation         Maybe           No
4         Out-of-order store operation for instructions that will  Maybe (1)       Maybe (1)
          cause no other kind of precise exception (in the
          absence of system-caused, imprecise, or floating-point
          assist exceptions)
5         All other out-of-order store operations                  Maybe (1)       No
6         Zero-length load (lswx)                                  Maybe           No
7         Zero-length store (stswx)                                Maybe (1)       Maybe (1)
8         Store conditional (stwcx.) that does not store           Maybe (1)       Maybe (1)
9         In-order instruction fetch                               Yes (2)         No
10        Load instruction or eciwx                                Yes             No
11        Store instruction, ecowx, dcbz, or dcba (3) instruction  Yes             Yes
12        icbi, dcbt, dcbtst, dcbst, or dcbf instruction           Maybe           No
13        dcbi instruction                                         Maybe (1)       Maybe (1)

Notes: 

(1) If C is set, R is guaranteed to also be set. 

(2) This includes the case in which the instruction was fetched out of order and R was not set. 

(3) For a dcba instruction that does not modify the target block, it is possible that neither bit is set. 

7.5.3.4 Synchronization of Memory Accesses and Referenced and Changed Bit Updates 

Although the processor updates the referenced and changed bits in the page tables 
automatically, these updates are not guaranteed to be immediately visible to the program 
after the load, store, or instruction fetch operation that caused the update. If processor A 
executes a load or store or fetches an instruction, the following conditions are met with 
respect to performing the access and performing any R and C bit updates: 

• If processor A subsequently executes a sync instruction, both the updates to the bits 
in the page table and the load or store operation are guaranteed to be performed with 
respect to all processors and mechanisms before the sync instruction completes on 
processor A. 

• Additionally, if processor B executes a tlbie instruction that 

— signals the invalidation to the hardware, 

— invalidates the TLB entry for the access in processor A, and 

— is detected by processor A after processor A has begun the access, 

and processor B executes a tlbsync instruction after it executes the tlbie, both the 
updates to the bits and the original access are guaranteed to be performed with 
respect to all processors and mechanisms before the tlbsync instruction completes 
on processor A. 

7.5.4 Page Memory Protection 

In addition to the no-execute option that can be programmed at the segment descriptor level 
to prevent instructions from being fetched from a given segment (shown in Figure 7-4), 
there are a number of other memory protection options that can be programmed at the page 
level. The page memory protection mechanism allows selectively granting read access, 
granting read/write access, and prohibiting access to areas of memory based on a number 
of control criteria. 

The memory protection used by the block and page address translation mechanisms is 
different in that the page address translation protection defines a key bit that, in conjunction 
with the PP bits, determines whether supervisor and user programs can access a page. For 
specific information about block address translation, refer to Section 7.4.4, “Block 
Memory Protection.” 

For page address translation, the memory protection mechanism is controlled by the 
following: 

• MSR[PR], which defines the mode of the access as follows: 

— MSR[PR] = 0 corresponds to supervisor mode 

— MSR[PR] = 1 corresponds to user mode 

• Ks and Kp, the supervisor and user key bits, which define the key for the page 

• The PP bits, which define the access options for the page 

The key bits (Ks and Kp) and the PP bits are located as follows for page address translation: 

• Ks and Kp are located in the segment descriptor. 

• The PP bits are located in the PTE. 

The key bits, the PP bits, and the MSR[PR] bit are used as follows: 

• When an access is generated, one of the key bits is selected to be the key as follows: 
— For supervisor accesses (MSR[PR] = 0), the Ks bit is used and Kp is ignored 
— For user accesses (MSR[PR] = 1), the Kp bit is used and Ks is ignored 

That is, key = (Kp & MSR[PR]) | (Ks & ¬MSR[PR]) 

• The selected key is used with the PP bits to determine if instruction fetching, load 
access, or store access is allowed. 

Table 7-18 shows the types of accesses that are allowed for the general case (all possible 
Ks, Kp, and PP bit combinations), assuming that the N bit in the segment descriptor is 
cleared (the no-execute option is not selected). 

Table 7-18. Access Protection Control with Key 

Key (1)  PP (2)  Page Type
-------  ------  ----------
0        00      Read/write
0        01      Read/write
0        10      Read/write
0        11      Read only
1        00      No access
1        01      Read only
1        10      Read/write
1        11      Read only

Notes: 

(1) Ks or Kp selected by state of MSR[PR] 

(2) PP protection option bits in PTE 


Thus, the conditions that cause a protection violation (not including the no-execute 
protection option for instruction fetches) are depicted in Table 7-19 and as a flow diagram 
in Figure 7-17. 

Any access attempted (read or write) when key = 1 and PP = 00 causes a protection 
violation exception condition. When key = 1 and PP = 01, an attempt to perform a write 
access causes a protection violation exception condition. When PP = 10, all accesses are 
allowed, and when PP = 11, write accesses always cause an exception. The processor takes 
either the ISI or the DSI exception (for an instruction or data access, respectively) when 
there is an attempt to violate the memory protection. 
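
The key selection and the protection check described above can be sketched in C as follows. This is only an illustrative model of Tables 7-18 and 7-19, not a description of any processor's hardware; the function names `select_key` and `page_access_permitted` are hypothetical, and the no-execute (N bit) check is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Select the key bit: Ks for supervisor accesses (MSR[PR] = 0),
 * Kp for user accesses (MSR[PR] = 1).
 * Implements key = (Kp & MSR[PR]) | (Ks & ~MSR[PR]) on single-bit inputs. */
static unsigned select_key(unsigned ks, unsigned kp, unsigned msr_pr)
{
    return ((kp & msr_pr) | (ks & ~msr_pr)) & 1u;
}

/* Return true if the access is permitted for the given key and PP bits.
 * A false result corresponds to a protection violation (ISI or DSI). */
static bool page_access_permitted(unsigned key, unsigned pp, bool is_write)
{
    pp &= 3u;
    if (pp == 2u)
        return true;        /* PP = 10: all accesses allowed            */
    if (pp == 3u)
        return !is_write;   /* PP = 11: writes always prohibited        */
    if (key == 0u)
        return true;        /* key = 0, PP = 0x: nothing prohibited     */
    if (pp == 0u)
        return false;       /* key = 1, PP = 00: no access              */
    return !is_write;       /* key = 1, PP = 01: read only              */
}
```

For example, with Ks = 0 and Kp = 1, a user-mode store to a page with PP = 01 selects key = 1 and is reported as prohibited, while the same store in supervisor mode selects key = 0 and is permitted.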

Table 7-19. Exception Conditions for Key and PP Combinations 

Key  PP  Prohibited Accesses
---  --  -------------------
0    0x  None
1    00  Read/write
1    01  Write
x    10  None
x    11  Write


Any combination of the Ks, Kp, and PP bits is allowed. One example is if the Ks and Kp 
bits are programmed so that the value of the key bit for Table 7-19 directly matches the 
MSR[PR] bit for the access. In this case, the encoding of Ks = 0 and Kp = 1 is used for the 
PTE, and the PP bits then enforce the protection options shown in Table 7-20. 


Table 7-20. Access Protection Encoding of PP Bits for Ks = 0 and Kp = 1 

PP     Option                 User Read  User Write  Supervisor Read  Supervisor Write
Field                         (Key = 1)  (Key = 1)   (Key = 0)        (Key = 0)
-----  ---------------------  ---------  ----------  ---------------  ----------------
00     Supervisor-only        Violation  Violation   ✓                ✓
01     Supervisor-write-only  ✓          Violation   ✓                ✓
10     Both user/supervisor   ✓          ✓           ✓                ✓
11     Both read-only         ✓          Violation   ✓                Violation


However, if the setting Ks = 1 is used, supervisor accesses are treated as user reads and 
writes with respect to Table 7-20. Likewise, if the setting Kp = 0 is used, user accesses to 
the page are treated as supervisor accesses in relation to Table 7-20. Therefore, by 
modifying one of the key bits (in the segment descriptor), the way the processor interprets 
accesses (supervisor or user) in a particular segment can easily be changed. Note, however, 
that only supervisor programs are allowed to modify the key bits for the segment descriptor. 
Access to the segment registers is privileged. 

When the memory protection mechanism prohibits a reference, the flow of events is similar 
to that for a memory protection violation occurring with the block protection mechanism. 
As shown in Figure 7-15, one of the following occurs depending on the type of access that 
was attempted: 

• For data accesses, a DSI exception is generated and DSISR[4] is set. If the access is 
a store, DSISR[6] is also set. 

• For instruction accesses, 

— an ISI exception is generated and SRR1[4] is set, or 

— an ISI exception is generated and SRR1[3] is set if the segment is designated as 
no-execute. 

The only difference between the flow shown in Figure 7-15 and that of the block memory 
protection violation is the ISI exception that can be caused by an attempt to fetch an 
instruction from a segment that has been designated as no-execute (N bit set in the segment 
descriptor). See Chapter 6, “Exceptions,” for more information about these exceptions. 



Figure 7-15. Memory Protection Violation Flow for Pages 

If the page protection mechanism prohibits a store operation, the changed bit is not set (in 
either the TLB or in the page tables in memory); however, a prohibited store access may 
cause a PTE to be loaded into the TLB and consequently cause the referenced bit to be set 
in a PTE (both in the TLB and in the page table in memory). 

7.5.5 Page Address Translation Summary 

Figure 7-16 provides the detailed flow for the page address translation mechanism which 
includes the checking of the N bit in the segment descriptor and then expands on the ‘TLB 
Hit’ branch of Figure 7-4. 

The detailed flow for the ‘TLB Miss’ branch of Figure 7-4 is described in Section 7.6.2, 
“Page Table Updates.” 

The checking of memory protection violation conditions for page address translation is 
shown in Figure 7-17. 

The ‘Invalidate TLB Entry’ box shown in Figure 7-16 is marked as implementation- 
specific as this level of detail for TLBs (and the existence of TLBs) is not dictated by the 
architecture. 

NOTE: Figure 7-16 does not show the detection of all exception conditions shown in 
Table 7-4 and Table 7-5; the flow for many of these exceptions is 
implementation-specific. 

[Flow diagram not reproduced; its legend marks certain boxes as implementation-specific.] 

Figure 7-16. Page Address Translation Flow — TLB Hit 

[Flow diagram not reproduced. Recoverable content: under "Check Page Memory Protection 
Violation Conditions," the key is selected (if MSR[PR] = 0, key = Ks; if MSR[PR] = 1, 
key = Kp). A read access with key || PP = 100, or a write access with key || PP equal to 
any of 011, 100, 101, or 111, leads to "Access Prohibited" (see Figure 7-15); otherwise 
the flow leads to "Access Permitted."] 

Figure 7-17. Page Memory Protection Violation Conditions for Page Address Translation 



7.6 Hashed Page Tables 

If a copy of the PTE corresponding to the VPN for an access is not resident in a TLB 
(corresponding to a miss in the TLB, provided a TLB is implemented), the processor must 
search for the PTE in the page tables set up in main memory by the operating system. 

The only variables the operating system has available when defining the page table are the 
size of the table and its location in main memory. The latter has no influence on system 
performance. The former (size) influences the number of PTEs in each group and thus 
determines the length of the serial search within a group before a match is found. 

The rule of thumb is to allocate a table of a size such that only one or two PTEs reside in a 
group. The hash value is defined by the architecture to be the XOR of the SID with the page 
index EA[4-19], and the hardware of all PowerPC processors uses this algorithm. While a 
system is running, the only other way to influence the distribution of PTEs in a 
page table is the assignment of SIDs to program segments. Some operating systems 
allocate SIDs from precalculated tables and assign values to programs that 
optimize the randomness of hash products. This in turn generates a flatter distribution of 
PTEs in the page table. 

The page table search operation is performed by hardware or software. In either case real 
addressing mode is used as if MSR[DR]=0 and the M bit is set. 

This section describes the format of the page tables and the algorithm used to access them. 
In addition, the constraints imposed on the software in updating the page tables (and other 
MMU resources) are described. 

7.6.1 Page Table Definition 

The hashed page table is a variable-sized data structure that defines the mapping between 
virtual page numbers and physical page numbers. The page table size is a power of 2, its 
starting address is a multiple of its size, and the table must reside in memory with the 
WIMG attributes of 0b0010. 

The page table contains a number of page table entry groups (PTEGs). A PTEG contains 
eight PTEs of eight bytes each; therefore, each PTEG is 64 bytes long. PTEG addresses are 
entry points for table search operations. Figure 7-18 shows two PTEG addresses 
(PTEGaddr1 and PTEGaddr2) where a given PTE may reside. 

[Figure not reproduced. It shows the page table as PTEG0 through PTEGn, each PTEG 
holding PTE0 through PTE7 (8 bytes per PTE), with PTEGaddr1 and PTEGaddr2 each 
pointing to one PTEG.] 

Figure 7-18. Page Table Definitions 

A given PTE can reside in one of two possible PTEGs: one is the primary PTEG and the 
other is the secondary PTEG. Additionally, a given PTE can reside in any of the PTE 
locations within an addressed PTEG. Thus, a given PTE may reside in one of 16 possible 
locations within the page table. If a given PTE is not in either the primary or secondary 
PTEG, a page table miss occurs; this is defined as a page fault condition. 

A table search operation is defined as the search for a PTE within a primary or secondary 
PTEG. When a table search operation commences, a primary hashing function is performed 
on the virtual address. The output of the hashing function is then concatenated with bits 
stored in the SDR1 register by the operating system to create the physical address of the 
primary PTEG. The PTEs in the PTEG are then checked, one by one, to see if there is a hit 
within the PTEG. If the PTE is not located, a secondary hashing function is performed, a 
new physical address is generated for the PTEG, and the PTE is searched for again, using 
the secondary PTEG address. 

Note, however, that although a given PTE may reside in one of 16 possible locations, an 
address that is a primary PTEG address for some accesses also functions as a secondary 
PTEG address for a second set of accesses (as defined by the secondary hashing function). 
Therefore, these 16 possible locations are really shared by two different sets of effective 
addresses. Section 7.6.1.6, “Page Table Structure Examples,” illustrates how PTEs map 
into the 16 possible locations as primary and secondary PTEs. 

7.6.1.1 SDR1 Register Definitions 

The SDR1 register contains the control information for the page table structure in that it 
defines the high-order bits for the physical base address of the page table and it defines the 
size of the table. 

NOTE: There are certain synchronization requirements for writing to SDR1, which are 
described in Section 2.3.17, “Synchronization Requirements for Special 
Registers and for Lookaside Buffers.” The format of the SDR1 register is shown in 
the following sections. 

Figure 7-19 shows the SDR1 register layout, and Table 7-21 shows its bit settings. 

        HTABORG          Reserved (0000 000)       HTABMASK
 0                  15   16               22   23            31

Figure 7-19. SDR1 Register Format 

Table 7-21. SDR1 Register Bit Settings 

Bits   Name      Description
-----  --------  -----------------------------------
0-15   HTABORG   Physical base address of page table
16-22  -         Reserved
23-31  HTABMASK  Mask for page table address


The HTABORG field in SDR1 contains the high-order 16 bits of the 32-bit physical address 
of the page table. Therefore, the beginning of the page table lies on a 2^16 byte (64 Kbyte) 
boundary at a minimum. If the processor does not support 32 bits of physical address, 
software should write zeros to those unsupported bits in the HTABORG field (as the 
implementation treats them as reserved). Otherwise, a machine check exception can occur. 

A page table can be any size 2^n bytes where 16 ≤ n ≤ 25. The HTABMASK field in SDR1 
contains a mask value that determines how many bits from the output of the hashing 
function are used as the page table index. This mask must be of the form 0b00...011...1 (a 
string of 0 bits followed by a string of 1 bits). As the table size increases, more bits are used 
from the output of the hashing function to index into the table. The 1 bits in HTABMASK 
determine how many additional bits (beyond the minimum of 10) from the hash are used in 
the index; the HTABORG field must have the same number of lower-order bits equal to 0 
as the HTABMASK field has lower-order bits equal to 1. 

Example: 

Suppose that the page table is 8,192 (2^13) 64-byte PTEGs, for a total size of 2^19 bytes 
(512 Kbytes). A 13-bit index is required. Ten bits are provided from the hash to start with, 
so 3 additional bits from the hash must be selected. Thus the value in HTABMASK must 
be 7 (3 binary 1s) and the value in HTABORG must have its low-order 3 bits 
(SDR1[13-15]) equal to 0. This means that the page table must begin on a 
2^(3 + 10 + 6) = 2^19 = 512-Kbyte boundary. 
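
The relationship between table size, HTABMASK, and the alignment of HTABORG can be sketched in C as follows. This is a minimal illustration under the rules stated above, not an operating system interface; `sdr1_encode` and `sdr1_is_valid` are hypothetical helper names, and the reserved bits 16-22 are simply written as zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an SDR1 value for a page table of 2^log2_size bytes located at
 * physical address 'base'. Returns false if the size is outside the
 * 2^16..2^25 range or if 'base' is not a multiple of the table size. */
static bool sdr1_encode(uint32_t base, unsigned log2_size, uint32_t *sdr1)
{
    if (log2_size < 16 || log2_size > 25)
        return false;
    uint32_t size = UINT32_C(1) << log2_size;
    if (base & (size - 1))
        return false;                           /* not size-aligned */
    /* One HTABMASK 1 bit per index bit beyond the minimum of 10. */
    uint32_t htabmask = (UINT32_C(1) << (log2_size - 16)) - 1;
    *sdr1 = (base & UINT32_C(0xFFFF0000)) | htabmask;
    return true;
}

/* Check that HTABMASK has the form 0b0...01...1 and that HTABORG has a
 * 0 bit wherever HTABMASK has a 1 bit. */
static bool sdr1_is_valid(uint32_t sdr1)
{
    uint32_t htaborg  = sdr1 >> 16;     /* SDR1[0-15]  */
    uint32_t htabmask = sdr1 & 0x1FF;   /* SDR1[23-31] */
    if (htabmask & (htabmask + 1))
        return false;                   /* 1 bits are not contiguous from bit 31 */
    return (htaborg & htabmask) == 0;
}
```

For instance, a 256-Kbyte table (4096 PTEGs) at 0xA600_0000 encodes as SDR1 = 0xA600_0003, and a 512-Kbyte table (8192 PTEGs) at 0x0F98_0000 encodes as SDR1 = 0x0F98_0007.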

7.6.1.2 Page Table Size 

The ratio between the number of entries in the page table and the page table capacity 
directly affects performance because it influences the hit probability and search time in the 
PTEG in the page table. If the table is too small, too many PTEs may be resident in each 
PTEG. This increases the serial search time within a group. In some cases all 16 entries 
could be utilized. This would cause unnecessary page thrashing. The minimum size for a 
page table is 64 Kbytes (2^10 PTEGs of 64 bytes each). The reason for this is that the 10 
low-order bits of the page index are not stored in the PTE. However, it is recommended that 
the total number of PTEGs in the page table be at least half the number of physical page 
frames to be mapped. This yields an average of 2 PTEs in a PTEG or a 25% utilization of the 
page table. While avoidance of hash collisions cannot be guaranteed for any size page table, 
making the page table larger than the recommended minimum size reduces the frequency 
of such collisions by making the primary PTEGs more sparsely populated, and further 
reducing the need to use the secondary PTEGs. Ideally, the best performance is realized 
where there is one PTEG for each physical page and there is a completely flat distribution 
of the hashing function. Then the hash pointer yields a hit every time and no serial search 
of the PTEG is necessary. A table of this size would have a 12.5% utilization of PTEs in the 
page table. 

Table 7-22 shows some example sizes for total main memory. The recommended minimum 
page table sizes for these example memory sizes are then outlined, along with their 
corresponding HTABORG and HTABMASK settings in SDR1. 

NOTE: Systems with less than 8 Mbytes of main memory may be designed with some 
processors, but the minimum amount of memory that can be used for the page 
tables in these cases is 64 Kbytes. 


Table 7-22. Minimum Recommended Page Table Sizes 

                    Recommended Minimum                            Settings for Recommended Minimum
                    ---------------------------------------------  --------------------------------
Total Main Memory   Memory for Page     Number of Mapped  Number   HTABORG               HTABMASK
                    Tables              Pages (PTEs)      of PTEGs (Maskable Bits 7-15)
------------------  ------------------  ----------------  -------- --------------------  -----------
8 Mbytes (2^23)     64 Kbytes (2^16)    2^13              2^10     x xxxx xxxx           0 0000 0000
16 Mbytes (2^24)    128 Kbytes (2^17)   2^14              2^11     x xxxx xxx0           0 0000 0001
32 Mbytes (2^25)    256 Kbytes (2^18)   2^15              2^12     x xxxx xx00           0 0000 0011
64 Mbytes (2^26)    512 Kbytes (2^19)   2^16              2^13     x xxxx x000           0 0000 0111
128 Mbytes (2^27)   1 Mbyte (2^20)      2^17              2^14     x xxxx 0000           0 0000 1111
256 Mbytes (2^28)   2 Mbytes (2^21)     2^18              2^15     x xxx0 0000           0 0001 1111
512 Mbytes (2^29)   4 Mbytes (2^22)     2^19              2^16     x xx00 0000           0 0011 1111
1 Gbyte (2^30)      8 Mbytes (2^23)     2^20              2^17     x x000 0000           0 0111 1111
2 Gbytes (2^31)     16 Mbytes (2^24)    2^21              2^18     x 0000 0000           0 1111 1111
4 Gbytes (2^32)     32 Mbytes (2^25)    2^22              2^19     0 0000 0000           1 1111 1111


As an example, if the physical memory size is 2^29 bytes (512 Mbytes), then there are 
2^29 ÷ 2^12 (4-Kbyte page size) = 2^17 (128 K) total page frames. If this number of page 
frames is divided by 2, the resultant minimum recommended page table size is 2^16 PTEGs, 
or 2^22 bytes (4 Mbytes) of memory for the page tables. 
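
The sizing arithmetic above can be expressed as a short C sketch. It assumes 4-Kbyte pages and 64-byte PTEGs as described in this section and takes the memory size as a power of 2; the function names are hypothetical and serve only to illustrate the calculation.

```c
#include <stdint.h>

/* Recommended minimum number of PTEGs (as a power of 2) for a main memory
 * of 2^log2_mem_bytes bytes: half the number of 4-Kbyte page frames,
 * but never fewer than the architectural minimum of 2^10 PTEGs. */
static unsigned recommended_log2_ptegs(unsigned log2_mem_bytes)
{
    if (log2_mem_bytes <= 23)
        return 10;                      /* 8 Mbytes or less: 64-Kbyte table */
    return log2_mem_bytes - 12 - 1;     /* (frames = bytes / 2^12) / 2      */
}

/* Size of the page table in bytes: each PTEG is 64 (2^6) bytes. */
static uint32_t page_table_bytes(unsigned log2_ptegs)
{
    return UINT32_C(1) << (log2_ptegs + 6);
}

/* HTABMASK value: one 1 bit per index bit beyond the minimum of 10. */
static uint32_t htabmask_for(unsigned log2_ptegs)
{
    return (UINT32_C(1) << (log2_ptegs - 10)) - 1;
}
```

For 512 Mbytes of memory (2^29 bytes) this gives 2^16 PTEGs, a 4-Mbyte table, and HTABMASK = 0b0_0011_1111, in agreement with Table 7-22.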

7.6.1.3 Page Table Hashing Functions 

The MMU uses two different hashing functions, a primary and a secondary, in the creation 
of the physical addresses used in a page table search operation. These hashing functions 
distribute the PTEs within the page table, in that there are two possible PTEGs where a 
given PTE can reside. Additionally, there are eight possible PTE locations within a PTEG 
where a given PTE can reside. If a PTE is not found using the primary hashing function, 
the secondary hashing function is performed, and the secondary PTEG is searched. 

NOTE: These two functions must also be used by the operating system to set up the page 
tables in memory appropriately. 

Typically, the hashing functions provide a high probability that a required PTE is resident 
in the page table, without requiring the definition of all possible PTEs in main memory. 
However, if a PTE is not found in the secondary PTEG, a page fault occurs and an exception 
is taken. Thus, the required PTE can then be placed into either the primary or secondary 
PTEG by the system software, and on the next TLB miss to this page (in those processors 
that implement a TLB), the PTE will be found in the page tables (and loaded into an 
on-chip TLB). 

The address of a PTEG is derived from the HTABORG field of the SDR1 register, and the 
output of the corresponding hashing function (primary hashing function for primary PTEG 
and secondary hashing function for a secondary PTEG). The value in the HTABMASK field 
determines how many of the higher-order hash value bits are masked and how many are 
used in the generation of the physical address of the PTEG. 

Figure 7-20 depicts the hashing functions defined by the PowerPC OEA. The inputs to the 
primary hashing function are the lower-order 19 bits of the VSID field of the selected 
segment register (bits 5-23 of the 52-bit virtual address), and the page index field of the 
effective address (bits 24-39 of the virtual address) concatenated with three zero 
higher-order bits. The XOR of these two values generates the output of the primary hashing 
function (hash value 1). 

When the secondary hashing function is required, the output of the primary hashing 
function is one’s complemented, to provide hash value 2. 
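
The two hashing functions can be sketched in C as follows. This is an illustrative model only; the function names are hypothetical, the VSID is passed as the 24-bit value held in the segment register, and the effective address is a 32-bit value whose bits 4-19 (in the big-endian bit numbering used by this manual) form the page index.

```c
#include <stdint.h>

#define HASH_MASK UINT32_C(0x7FFFF)   /* hash values are 19 bits wide */

/* Primary hash: the low-order 19 bits of the VSID XORed with the 16-bit
 * page index, which is zero-extended by three high-order bits. */
static uint32_t hash_primary(uint32_t vsid, uint32_t ea)
{
    uint32_t page_index = (ea >> 12) & 0xFFFF;   /* EA[4-19] */
    return (vsid & HASH_MASK) ^ page_index;
}

/* Secondary hash: the one's complement of the primary hash value. */
static uint32_t hash_secondary(uint32_t hash1)
{
    return ~hash1 & HASH_MASK;
}
```

Because the secondary hash is simply the complement of the primary hash, applying `hash_secondary` twice returns the original value, which is why a PTEG that is primary for one address is secondary for another.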

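[Figure not reproduced. Primary hash: the lower-order 19 bits of the VSID (VA5-VA23, from 
the segment register) are XORed with 000 concatenated with the page index (VA24-VA39, 
from effective address bits 4-19) to give the output of hashing function 1, hash value 1 
(bits 0-18). Secondary hash: hash value 1 (bits 0-18) is one's complemented to give 
hash value 2 (bits 0-18).] 

Figure 7-20. Hashing Functions for Page Tables 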

7.6.1.4 Page Table Addresses 

The following sections illustrate hash address generation and table structures for the page 
table and the SDR1 register that locates and defines the size of the page table. 

Two of the elements that define the virtual address (the VSID field of the segment descriptor 
and the page index field of the effective address) are used as inputs into a hashing function. 
Depending on whether the primary or secondary PTEG is to be accessed, the processor uses 
either the primary or secondary hashing function as described in Section 7.6.1.3, “Page 
Table Hashing Functions.” 

NOTE: Unless all accesses to be performed by the processor can be translated by the 
BAT mechanism when address translation is enabled (MSR[DR] or 
MSR[IR] = 1), the SDR1 must point to a valid page table; otherwise, a machine 
check exception can occur. 

Additionally, care should be taken that the page table location in memory does not conflict 
with other reserved areas in memory, such as the exception vector programs or tables, 
memory-mapped I/O areas, or other implementation-specific areas (refer to Section 7.2.1.1, 
“Predefined Physical Memory Locations”). The base address of the page table is defined by 
the 16 high-order bits of SDR1 (that is, HTABORG). 

When a TLB miss occurs, a PTEG address is generated as follows: The high-order 7 bits 
are taken directly from the corresponding bits of SDR1. The low-order 6 bits are set to zero. 
A hash value (ideally a random number with a flat distribution) is generated by an XOR 
of VA[5-23] with 3 zeros concatenated to VA[24-39], yielding a 19-bit value. Depending 
upon the page table size, at least 10 and at most 19 bits are passed forward. The number of 
bits selected is controlled by the HTABMASK bits of the SDR1 register. This mask is a 
9-bit value and is ANDed with the 9 high-order bits of the hash value. The result of this 
Boolean operation is passed forward and ORed with the low-order 9 bits of HTABORG 
(that is, SDR1[7-15]). The output of these two Boolean operations becomes the nine address 
bits PTEG[7-15] of the PTEG address. Address bits PTEG[16-25] are taken directly from the 
10 low-order bits of the hash value. 
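
The address generation steps above can be sketched in C as follows. This is an illustrative model, not hardware documentation; `pteg_address` is a hypothetical name, `sdr1` is the 32-bit SDR1 value, and `hash` is a 19-bit hash value (primary or secondary).

```c
#include <stdint.h>

/* Form the 32-bit physical address of a PTEG from SDR1 and a 19-bit hash. */
static uint32_t pteg_address(uint32_t sdr1, uint32_t hash)
{
    uint32_t htaborg  = sdr1 >> 16;            /* SDR1[0-15]               */
    uint32_t htabmask = sdr1 & 0x1FF;          /* SDR1[23-31]              */
    uint32_t hash_hi  = (hash >> 10) & 0x1FF;  /* 9 high-order hash bits   */
    uint32_t hash_lo  = hash & 0x3FF;          /* 10 low-order hash bits   */

    uint32_t bits_0_6  = htaborg & 0xFE00;     /* taken directly from SDR1 */
    uint32_t bits_7_15 = (htaborg & 0x01FF) | (hash_hi & htabmask);

    /* Bits 16-25 come from the hash; bits 26-31 are zero (64-byte PTEG). */
    return ((bits_0_6 | bits_7_15) << 16) | (hash_lo << 6);
}
```

As a check against the examples in this chapter, SDR1 = 0x0F98_0007 (a table of 8192 PTEGs at 0x0F98_0000) combined with a hash whose low-order 13 bits are 0x1FE6 yields the PTEG address 0x0F9F_F980.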

Figure 7-21 provides a graphical description of the generation of the PTEG address. 

Figure 7-21. Generation of Addresses for Page Tables 

7.6.1.5 Page Table Structure Summary 

In the process of searching for a PTE, the processor interprets the values read from memory 
as described in Section 7.5.2.2, “Page Table Entry (PTE) Definitions.” The VSID and the 
abbreviated page index (API) fields of the virtual address of the access are compared to 
those same fields of the PTEs in memory. In addition, the valid (V) bit and the hashing 
function (H) bit are also checked. For a hit to occur, the V bit of the PTE in memory must 
be set. If the fields match and the entry is valid, the PTE is considered a hit if the H bit is 
set as follows: 

• If this is the primary PTEG, H = 0 

• If this is the secondary PTEG, H = 1 

The physical address of the PTE(s) to be checked is derived as shown in Figure 7-22 and 
Figure 7-23, and the generated address is the address of a group of eight PTEs (a PTEG). 
During a table search operation, the processor compares up to 16 PTEs: PTE0-PTE7 of the 
primary PTEG (defined by the primary hashing function) and PTE0-PTE7 of the secondary 
PTEG (defined by the secondary hashing function). 

If the VSID and API fields do not match or V and H are not set appropriately for any of 
these PTEs, a page fault occurs and an exception is taken. The page in question is considered 
as nonresident (page fault) and the operating system must load the page into main memory 
and update the page table accordingly. If a valid PTE is located in the page table, the page 
is considered resident and the TLB can be loaded. 

The architecture does not specify the order in which the PTEs are checked. 
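
The match criteria above can be sketched in C as follows. The sketch assumes the 32-bit PTE layout given in Section 7.5.2.2 (first word: V in bit 0, VSID in bits 1-24, H in bit 25, API in bits 26-31); the type and function names are hypothetical, and the in-order scan from PTE0 is only one possible order, since the architecture does not specify one.

```c
#include <stdbool.h>
#include <stdint.h>

/* One 8-byte page table entry, held as two 32-bit words. */
typedef struct {
    uint32_t w0;   /* V | VSID | H | API                      */
    uint32_t w1;   /* RPN, R, C, WIMG, PP (not examined here) */
} pte_t;

/* True if the PTE is valid and its VSID, API, and H fields match. */
static bool pte_matches(pte_t pte, uint32_t vsid, uint32_t api, unsigned h)
{
    bool     valid    = (pte.w0 >> 31) & 1u;          /* bit 0      */
    uint32_t pte_vsid = (pte.w0 >> 7) & 0xFFFFFF;     /* bits 1-24  */
    unsigned pte_h    = (pte.w0 >> 6) & 1u;           /* bit 25     */
    uint32_t pte_api  = pte.w0 & 0x3F;                /* bits 26-31 */

    return valid &&
           pte_vsid == (vsid & 0xFFFFFF) &&
           pte_api  == (api & 0x3F) &&
           pte_h    == (h & 1u);
}

/* Scan one PTEG (eight PTEs). Pass h = 0 for the primary PTEG and
 * h = 1 for the secondary PTEG. Returns the index of the first
 * matching PTE, or -1 if none of the eight entries matches. */
static int pteg_search(const pte_t pteg[8], uint32_t vsid, uint32_t api,
                       unsigned h)
{
    for (int i = 0; i < 8; i++) {
        if (pte_matches(pteg[i], vsid, api, h))
            return i;
    }
    return -1;
}
```

A full table search would call `pteg_search` on the primary PTEG with h = 0 and, on failure, on the secondary PTEG with h = 1; if both return -1, the condition corresponds to a page fault.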

NOTE: For maximum performance, however, PTEs should be allocated by the operating 
system beginning with the PTE0 location within the primary PTEG, then 
PTE1, and so on. If more than eight PTEs are required within the address space 
that defines a PTEG address, the secondary PTEG can be used (again, allocating 
PTE0 of the secondary PTEG first, and so on, is recommended). Additionally, 
it may be desirable to place the PTEs that will require most frequent access at the 
beginning of a PTEG and reserve the PTEs in the secondary PTEG for the least 
frequently accessed PTEs. 

The architecture also allows for multiple matching entries to be found within a table search 
operation. Multiple matching PTEs are allowed if they meet the match criteria described 
above, as well as have identical RPN, WIMG, and PP values, allowing for differences in 
the R and C bits. In this case, one of the matching PTEs is used and the R and C bits are 
updated according to this PTE. In the case that multiple PTEs are found that meet the match 
criteria but differ in the RPN, WIMG or PP fields, the translation is undefined and the 
resultant R and C bits in the matching entries are also undefined. 

NOTE: Multiple matching entries can also differ in the setting of the H bit, but the H bit 
must be set according to whether the PTE was located in the primary or 
secondary PTEG, as described above. 

7.6.1.6 Page Table Structure Example 

Figure 7-22 shows the structure of an example page table. The base address of the page 
table is defined by SDR1[HTABORG] concatenated with 16 zero bits. In this example, the 
address is identified by bits 0-13 in SDR1[HTABORG]; note that bits 14 and 15 of 
HTABORG must be zero because the lower-order two bits of HTABMASK are ones. The 
addresses for individual PTEGs within this page table are then defined by bits 14-25 as an 
offset from bits 0-13 of this base address. Thus, the size of the page table is defined as 4096 
PTEGs. 


Example: 

Given: SDR1 = 1010 0110 0000 0000 0000 0000 0000 0011 
              (HTABORG in bits 0-15, HTABMASK in bits 23-31) 

Base address of the page table: $A600 0000 

Page table: PTEG0 through PTEG4095, each holding PTE0 through PTE7 

             0                14            25      31
PTEGaddr1 =  1010 0110 0000 00mm aaaa aaaa aa00 0000
PTEGaddr2 =  1010 0110 0000 00nn bbbb bbbb bb00 0000

Figure 7-22. Example Page Table Structure 

Two example PTEG addresses are shown in Figure 7-22 as PTEGaddr1 and 
PTEGaddr2. Bits 14-25 of each PTEG address in this example page table are derived from 
the output of the hashing function (bits 26-31 are zero to start with PTE0 of the PTEG). In 
this example, the ‘b’ bits in PTEGaddr2 are the one’s complement of the ‘a’ bits in 
PTEGaddr1. The ‘n’ bits are also the one’s complement of the ‘m’ bits, but these two bits 
are generated from bits 7-8 of the output of the hashing function, logically ORed with bits 
14-15 of the HTABORG field (which must be zero). If bits 14-25 of PTEGaddr1 were 
derived by using the primary hashing function, then PTEGaddr2 corresponds to the 
secondary PTEG. 

Note, however, that bits 14-25 in PTEGaddr2 can also be derived from a combination of 
effective address bits, segment register bits, and the primary hashing function. In this case, 
then PTEGaddrl corresponds to the secondary PTEG. Thus, while a PTEG may be 
considered a primary PTEG for some effective addresses (and segment register bits), it may 
also correspond to the secondary PTEG for a different effective address (and segment 
register value). 

It is the value of the H bit in each of the individual PTEs that identifies a particular PTE as 
either primary or secondary (there may be PTEs that correspond to a primary PTEG and 
PTEs that correspond to a secondary PTEG, all within the same physical PTEG address 
space). Thus, only the PTEs that have H = 0 are checked for a hit during a primary PTEG 
search. Likewise, only PTEs with H = 1 are checked in the case of a secondary PTEG 
search. 

7.6.1.7 PTEG Address Mapping Examples 

This section contains an example of an effective address and how its address translation (the 
PTE) maps into the primary PTEG in physical memory. The example illustrates how the 
processor generates PTEG addresses for a table search operation; this is also the algorithm 
that must be used by the operating system when placing page table entries into the page 
table. 

Figure 7-23 shows an example of PTEG address generation. In the example, the value in 
SDR1 defines a page table at address 0x0F98_0000 that contains 8192 PTEGs. The 
example effective address selects segment register 0 (SR0) using the high-order four bits. 
The contents of SR0 are then used along with bits 4-31 of the effective address to create 
the 52-bit virtual address. 

To generate the address of the primary PTEG, bits 5-23, and bits 24-39 of the virtual 
address are then used as inputs into the primary hashing function (XOR) to generate hash 
value 1. The low-order 13 bits of hash value 1 are then concatenated with the high-order 13 
bits of HTABORG and with six low-order 0 bits, defining the address of the primary PTEG 
(0x0F9F_F980). 
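The primary PTEG address generation described above can be sketched as follows. This is an illustrative sketch, not the processor's implementation: the function name is hypothetical, and the input values (EA 0x00FF_A01B, SR0 = 0x20CA_701C, SDR1 = 0x0F98_0007) are inferred from the bit patterns and table size given for this example.

```python
def primary_pteg_address(ea, segment_registers, sdr1):
    """Return (virtual_address, hash_value_1, primary_pteg_address)."""
    sr = segment_registers[(ea >> 28) & 0xF]   # EA bits 0-3 select the segment register
    vsid = sr & 0xFFFFFF                       # SR bits 8-31
    page_index = (ea >> 12) & 0xFFFF           # EA bits 4-19
    byte_offset = ea & 0xFFF                   # EA bits 20-31
    virtual_address = (vsid << 28) | (page_index << 12) | byte_offset  # 52 bits

    # Primary hash: VA bits 5-23 (low-order 19 bits of the VSID) XOR VA bits 24-39.
    hash1 = (vsid & 0x7FFFF) ^ page_index

    htaborg = (sdr1 >> 16) & 0xFFFF
    htabmask = sdr1 & 0x1FF
    index_bits = 10 + bin(htabmask).count("1")  # 13 for a table of 8192 PTEGs
    pteg_index = hash1 & ((1 << index_bits) - 1)
    # Assumes the HTABORG bits under HTABMASK are zero, so OR acts as concatenation.
    address = (htaborg << 16) | (pteg_index << 6)
    return virtual_address, hash1, address


segment_registers = [0] * 16
segment_registers[0] = 0x20CA701C              # assumed SR0 contents for this example
va, hash1, pteg = primary_pteg_address(0x00FFA01B, segment_registers, 0x0F980007)
print(hex(va), hex(hash1), hex(pteg))
```

An operating system placing a PTE into the table would run the same computation to decide which PTEG the entry belongs in.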
Example:

    SR0 (selected by EA bits 0-3) = 0010 0000 1100 1010 0111 0000 0001 1100
        VSID (SR0 bits 8-31)      = 1100 1010 0111 0000 0001 1100

    Virtual address (52 bits):
        VSID        = 1100 1010 0111 0000 0001 1100
        Page index  = 0000 1111 1111 1010
        Byte offset = 0000 0001 1011

Figure 7-23. Example Primary PTEG Address Generation 

Figure 7-24 shows the generation of the secondary PTEG address for this example. If the 
secondary PTEG is required, the secondary hash function is performed; its result, hash 
value 2, is the one’s complement of hash value 1. The high-order 9 bits of hash value 2 are 
ANDed with HTABMASK and then ORed with the low-order 9 bits of HTABORG (bits 
13-15 of HTABORG must be zero) to form bits 7-15 of the address. These bits are 
concatenated with HTABORG[0-6] on the left and, on the right, with the low-order 10 bits 
of hash value 2 and six low-order 0 bits to form the address of the secondary PTEG 
(0x0F98_0640). 
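The AND, OR, and concatenation steps for the secondary PTEG can be sketched as follows. The function names are hypothetical; hash value 1 (0x27FE6) and SDR1 (0x0F98_0007) are the values implied by this example.

```python
HASH_MASK = 0x7FFFF  # hash values are 19 bits wide


def secondary_hash(hash1):
    """Hash value 2 is the one's complement of hash value 1 (19 bits)."""
    return ~hash1 & HASH_MASK


def pteg_address(hash_value, sdr1):
    """Form a 32-bit PTEG address from a 19-bit hash value and SDR1."""
    htaborg = (sdr1 >> 16) & 0xFFFF
    htabmask = sdr1 & 0x1FF
    bits_0_6 = htaborg >> 9                                   # HTABORG[0-6]
    bits_7_15 = (htaborg & 0x1FF) | ((hash_value >> 10) & htabmask)
    bits_16_25 = hash_value & 0x3FF                           # low-order 10 hash bits
    return (bits_0_6 << 25) | (bits_7_15 << 16) | (bits_16_25 << 6)


hash2 = secondary_hash(0x27FE6)
secondary = pteg_address(hash2, 0x0F980007)
print(hex(hash2), hex(secondary))
```

Passing hash value 1 to the same `pteg_address` function yields the primary PTEG address, since only the hash input differs between the two cases.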
As described in Figure 7-22, the 10 low-order bits of the page index field are always used 
in the generation of a PTEG address (through the hashing function). This is why only the 
abbreviated page index (API) is defined for a PTE (the entire page index field does not need 
to be checked). For a given effective address, the low-order 10 bits of the page index (at 
least) contribute to the PTEG address (both primary and secondary) where the 
corresponding PTE may reside in memory. Therefore, if the high-order 6 bits (the API field) 
of the page index match with the API field of a PTE within the specified PTEG, the PTE 
mapping is guaranteed to be the unique PTE required. 


    Hash value 1 = 010 0111 1111 1110 0110

    Page table: PTEG0 ... PTEG25 (secondary PTEG) ... PTEG8166 (primary PTEG) ... PTEG8191


Figure 7-24. Example Secondary PTEG Address Generation 

NOTE: A given PTEG address does not map back to a unique effective address. Not only 
can a given PTEG be considered both a primary and a secondary PTEG (as 
described in Section 7.6.1.6, “Page Table Structure Examples”), but in this 
example, bits 24-26 of the page index field of the virtual address are not used to 
generate the PTEG address. Therefore, any of the eight combinations of these 
bits will map to the same primary PTEG address. (However, these bits are part 
of the API and are therefore compared for each PTE within the PTEG to 
determine if there is a hit.) Furthermore, an effective address can select a 
different segment register with a different value such that the output of the 
primary (or secondary) hashing function happens to equal the hash values shown 
in the example. Thus, these effective addresses would also map to the same 
PTEG addresses shown. 
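The many-to-one mapping described in this note can be checked with a short sketch. The helper name is hypothetical, and the VSID, page index, and HTABMASK values are those implied by the example; the PTEG index is taken relative to the start of the page table.

```python
def pteg_index(vsid, page_index, htabmask):
    """Return the index (relative to the table base) of the primary PTEG."""
    hash1 = (vsid & 0x7FFFF) ^ (page_index & 0xFFFF)
    used_hash_bits = (htabmask << 10) | 0x3FF   # only these hash bits reach the address
    return hash1 & used_hash_bits


VSID = 0xCA701C
PAGE_INDEX = 0x0FFA
HTABMASK = 0b000000111

# Vary VA bits 24-26 (the three high-order bits of the 16-bit page index).
variants = [(top3 << 13) | (PAGE_INDEX & 0x1FFF) for top3 in range(8)]
indexes = {pteg_index(VSID, pi, HTABMASK) for pi in variants}
apis = {pi >> 10 for pi in variants}            # API = high-order 6 bits of page index

print(indexes, len(apis))
```

All eight page indexes land in the same PTEG, but each has a different API value, so the API comparison during the table search still distinguishes them.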


7.6.2 Page Table Search Process 

An outline of the page table search process is as follows: 

1. The 32-bit physical addresses of the primary and secondary PTEGs are generated as 
described in Section 7.6.1.7, “PTEG Address Mapping Examples.” 

2. As many as 16 PTEs (from the primary and secondary PTEGs) are read from 
memory. (The architecture does not specify the order of these reads, allowing 
multiple reads to occur in parallel.) 

PTE reads occur with an implied WIM memory/cache mode control bit setting of 
0b001; therefore, they are considered cacheable. 

3. The PTEs in the selected PTEGs are tested for a match with the virtual page number 
(VPN) of the access. (The VPN is the VSID concatenated with the page index field 
of the effective address.) 

For a match to occur, the following must be true: 

- PTE[H] = 0 for primary PTEG; PTE[H] = 1 for secondary PTEG 
- PTE[V] = 1 
- PTE[VSID] = VA[0-23] 
- PTE[API] = VA[24-29] 

4. If a match is not found within the eight PTEs of the primary PTEG and the eight 
PTEs of the secondary PTEG, an exception is generated as described in step 8. 

If a match (or multiple matches) is found, the table search process continues. 

5. If multiple matches are found, all of the following must be true: 

- PTE[RPN] is equal for all matching entries 
- PTE[WIMG] is equal for all matching entries 
- PTE[PP] is equal for all matching entries 

6. If one of the fields in step 5 does not match, the translation is undefined, and the R and 
C bits of the matching entries are undefined. Otherwise, the R and C bits are updated 
based on one of the matching entries. 

7. A copy of the PTE is written into the on-chip TLB (if implemented) and the R bit is 
updated in the PTE in memory (if necessary). If there is no memory protection 
violation, the C bit is also updated in memory (if necessary) and the table search is 
complete. 

8. If a match is not found within the primary or secondary PTEG, the search fails, and 
a page fault exception condition occurs (either an ISI or DSI exception). 

Reads from memory for page table search operations are performed as if the WIM bit 
settings were 0b001 (that is, as unguarded cacheable operations in which coherency is 
required). 
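The match conditions in steps 3 through 8 can be sketched as follows. This is a simplified model, not a description of any implementation: the function names are hypothetical, each PTE is represented as a (word0, word1) tuple, and the sketch assumes the 32-bit PTE layout defined by the architecture (V in bit 0, VSID in bits 1-24, H in bit 25, API in bits 26-31 of the first word; RPN, WIMG, and PP in the second word). TLB loading and R/C updates are omitted.

```python
def pte_matches(word0, vsid, api, secondary):
    """Apply the step 3 match conditions to the first word of one PTE."""
    v = (word0 >> 31) & 0x1              # PTE bit 0
    pte_vsid = (word0 >> 7) & 0xFFFFFF   # PTE bits 1-24
    h = (word0 >> 6) & 0x1               # PTE bit 25
    pte_api = word0 & 0x3F               # PTE bits 26-31
    return v == 1 and h == int(secondary) and pte_vsid == vsid and pte_api == api


def table_search(primary_pteg, secondary_pteg, vsid, api):
    """Return the matching (word0, word1) PTE, or None for a page fault."""
    matches = [p for p in primary_pteg if pte_matches(p[0], vsid, api, False)]
    matches += [p for p in secondary_pteg if pte_matches(p[0], vsid, api, True)]
    if not matches:
        return None                      # step 8: ISI or DSI exception
    # Step 5: RPN, WIMG, and PP must agree across multiple matches.
    translation_mask = 0xFFFFF000 | (0xF << 3) | 0x3
    if len({p[1] & translation_mask for p in matches}) > 1:
        raise ValueError("inconsistent duplicate PTEs: translation undefined")
    return matches[0]


EMPTY = (0, 0)
entry = (0x80000000 | (0xCA701C << 7) | 0x03, 0x12345012)   # V = 1, H = 0
primary = [EMPTY] * 7 + [entry]
hit = table_search(primary, [EMPTY] * 8, vsid=0xCA701C, api=0x03)
miss = table_search([EMPTY] * 8, primary, vsid=0xCA701C, api=0x03)
print(hit, miss)
```

The second call misses because the entry has H = 0 and is therefore not considered when it is found during a secondary PTEG search.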
7.6.2.1 Flow for Page Table Search Operation 

Figure 7-25 provides a detailed flow diagram of a page table search operation. Note that the 
references to TLBs are shown as optional because TLBs are not required; if they do exist, 
the specifics of how they are maintained are implementation-specific. Also, Figure 7-25 
shows only a few cases of R-bit and C-bit updates. For a complete list of the R- and C-bit 
updates dictated by the architecture, refer to Table 7-17. 



Figure 7-25. Page Table Search Flow 
7.6.3 Page Table Updates 

This section describes the requirements on the software when updating page tables in 
memory via some pseudocode examples. Multiprocessor systems must follow the rules 
described in this section so that all processors operate with a consistent set of page tables. 
Even single processor systems must follow certain rules, because software changes must be 
synchronized with the other instructions in execution and with automatic updates that may 
be made by the hardware (referenced and changed bit updates). 

Updates to the tables include the following operations: 

• Adding a PTE 

• Modifying a PTE, including modifying the R and C bits of a PTE 

• Deleting a PTE 

PTEs must be locked on multiprocessor systems. Access to PTEs must be appropriately 
synchronized by software locking of (that is, guaranteeing exclusive access to) PTEs or 
PTEGs if more than one processor can modify the table at that time. In the examples below, 
software locks should be performed to provide exclusive access to the PTE being updated. 
However, the architecture does not dictate the specific protocol to be used for locking (for 
example, a single lock, a lock per PTEG, or a lock per PTE can be used). See Appendix E, 
“Synchronization Programming Examples,” for more information about the use of the 
reservation instructions (such as the lwarx and stwcx. instructions) to perform software 
locking. 

When TLBs are implemented they are defined as noncoherent caches of the page tables. 
TLB entries must be invalidated explicitly with the TLB invalidate entry instruction (tlbie) 
whenever the corresponding PTE is modified. In a multiprocessor system, the tlbie 
instruction must be controlled by software locking, so that the tlbie is issued on only one 
processor at a time. 

The PowerPC OEA defines the tlbsync instruction that ensures that TLB invalidate 
operations executed by this processor have caused all appropriate actions in other 
processors. In a system that contains multiple processors, the tlbsync functionality must be 
used in order to ensure proper synchronization with the other PowerPC processors. 

NOTE: A sync instruction must also follow the tlbsync to ensure that the tlbsync has 
completed execution on this processor. 

On single processor systems, PTEs need not be locked and the eieio instructions (in 
between the tlbie and tlbsync instructions) and the tlbsync instructions themselves are not 
required. The sync instructions shown are required even for single processor systems (to 
ensure that all previous changes to the page tables and all preceding tlbie instructions have 
completed). 

Any processor, including the processor modifying the page table, may access the page table 
at any time in an attempt to reload a TLB entry. An inconsistent PTE must never 
accidentally become visible (if V = 1); thus, there must be synchronization between 
modifications to the valid bit and any other modifications (to avoid corrupted data). 

In the pseudocode examples that follow, each change made to a PTE that is shown as a 
single line in the example is assumed to be performed with an atomic store instruction. 
Appropriate modifications must be made to these examples if this assumption is not satisfied. 

Updates of R and C bits by the processor are not synchronized with the accesses that cause 
the updates. When modifying the low-order half of a PTE, software must take care to avoid 
overwriting a processor update of these bits and to avoid having the value written by a store 
instruction overwritten by a processor update. The processor does not alter any other fields 
of the PTE. 

Explicitly altering certain MSR bits (using the mtmsr instruction), or explicitly altering 
PTEs, or certain system registers, may have the side effect of changing the effective or 
physical addresses from which the current instruction stream is being fetched. This kind of 
side effect is defined as an implicit branch. Therefore, PTEs must not be changed in a 
manner that causes an implicit branch. Section 2.3.17, “Synchronization Requirements for 
Special Registers and for Lookaside Buffers,” lists the possible implicit branch conditions 
that can occur when system registers and MSR bits are changed. 

For a complete list of the synchronization requirements for executing the MMU 
instructions, see Section 2.3.17, “Synchronization Requirements for Special Registers and 
for Lookaside Buffers.” 

The following examples show the required sequence of operations. However, other 
instructions may be interleaved within the sequences shown. 

7.6.3.1 Adding a Page Table Entry 

Adding a page table entry requires only a lock on the PTE in a multiprocessor system. The 
low-order word of the PTE (RPN, R, C, WIMG, PP) is written first (this example assumes 
the old valid bit was cleared), the eieio instruction orders that update ahead of the next, and 
then the second update, which sets V = 1, can be made. A sync instruction ensures that the 
updates have been made to memory. 

lock(PTE) 
PTE[RPN,R,C,WIMG,PP] <- new values 
eieio                                /* order 1st PTE update before 2nd */ 
PTE[VSID,H,API,V] <- new values (V = 1) 
sync                                 /* ensure updates completed */ 
unlock(PTE) 
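The ordering in this sequence can be modeled with a short sketch. This is a single-threaded simulation, not real synchronization code: the function names are hypothetical, the `barrier` callback merely stands in for the eieio and sync instructions, locking is omitted, and the field packing assumes the 32-bit PTE layout defined by the architecture.

```python
def pack_pte(vsid, h, api, rpn, wimg, pp, r=0, c=0, v=1):
    """Pack PTE fields into (word0, word1) using the 32-bit PTE layout."""
    word0 = (v << 31) | ((vsid & 0xFFFFFF) << 7) | ((h & 1) << 6) | (api & 0x3F)
    word1 = (((rpn & 0xFFFFF) << 12) | ((r & 1) << 8) | ((c & 1) << 7)
             | ((wimg & 0xF) << 3) | (pp & 0x3))
    return word0, word1


def add_pte(table, index, word0, word1, barrier):
    """Store a new PTE so that the valid bit becomes visible last."""
    table[index + 1] = word1      # PTE[RPN,R,C,WIMG,PP] <- new values
    barrier("eieio")              # order 1st PTE update before 2nd
    table[index] = word0          # PTE[VSID,H,API,V] <- new values (V = 1)
    barrier("sync")               # ensure updates completed


table = [0, 0]                    # one invalid PTE (two 32-bit words)
trace = []
w0, w1 = pack_pte(vsid=0xCA701C, h=0, api=0x03, rpn=0x12345, wimg=0b0010, pp=0b10)
add_pte(table, 0, w0, w1, lambda name: trace.append((name, table[0] >> 31)))
print(trace)
```

The trace records the valid bit as seen at each barrier, showing that V is still 0 when the eieio point is reached and becomes 1 only after the translation fields are in place.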
7.6.3.2 Modifying a Page Table Entry 

The following sections describe several scenarios for modifying a PTE. 


7.6.3.2.1 General Case 

Consider the general case where a currently-valid PTE must be changed. 
To do this, the PTE must be: 

• Locked 
• Marked invalid 
• Updated 
• Invalidated from the TLB 
• Marked valid again 
• Unlocked 


The sync instruction must be used at appropriate times to wait for modifications to 
complete. 

NOTE: The tlbsync and the sync instruction that follows it are only required if software 
consistency must be maintained with other PowerPC processors in a 
multiprocessor system (and the software is to be used in a multiprocessor 
environment). 


The following pseudocode shows the steps for the general case: 

lock(PTE) 
PTE[V] <- 0                          /* (other fields don’t matter) */ 
sync                                 /* ensure update completed */ 
PTE[RPN,R,C,WIMG,PP] <- new values 
tlbie(old_EA)                        /* invalidate old translation */ 
eieio                                /* order tlbie before tlbsync and order 2nd PTE update before 3rd */ 
PTE[VSID,H,API,V] <- new values (V = 1) 
tlbsync                              /* ensure tlbie completed on all processors */ 
sync                                 /* ensure tlbsync and last update completed */ 
unlock(PTE) 

7.6.3.2.2 Clearing the Referenced (R) Bit 

When the PTE is modified only to clear the R bit to 0, a much simpler algorithm suffices 
because the R bit need not be maintained exactly. The pseudo-code for this case: 

lock(PTE) 
oldR <- PTE[R]                       /* get old R */ 
if oldR = 1, then 
    PTE[R] <- 0                      /* store byte (R = 0, other bits unchanged) */ 
    tlbie(PTE)                       /* invalidate entry */ 
    eieio                            /* order tlbie before tlbsync */ 
    tlbsync                          /* ensure tlbie completed on all processors */ 
    sync                             /* ensure tlbsync and update completed */ 
unlock(PTE) 

Since only the R and C bits are modified by the processor, and since they reside in different 
bytes, the R bit can be cleared by reading the current contents of the byte in the PTE 
containing R (bits 16-23 of the second word), ANDing the value with 0xFE, and storing 
the byte back into the PTE. 
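The byte-wide update can be illustrated with a small sketch. The function name is hypothetical, the PTE is modeled as an 8-byte big-endian `bytearray`, and the TLB invalidation and synchronization steps of the pseudocode are omitted.

```python
R_BYTE = 6   # byte holding bits 16-23 of the second PTE word (R is its low-order bit)


def clear_referenced_bit(pte):
    """Clear R with a byte-wide read-modify-write; return the old R value."""
    old_r = pte[R_BYTE] & 0x01
    if old_r:
        pte[R_BYTE] &= 0xFE      # R = 0, other bits in the byte unchanged
    return old_r


word0 = 0xE5380E03
word1 = (0x12345 << 12) | (1 << 8) | (1 << 7) | (0b0010 << 3) | 0b10   # R = 1, C = 1
pte = bytearray(word0.to_bytes(4, "big") + word1.to_bytes(4, "big"))
old_r = clear_referenced_bit(pte)
new_word1 = int.from_bytes(pte[4:], "big")
print(old_r, hex(new_word1))
```

Because C lives in the following byte (bits 24-31 of the second word), the byte store cannot disturb a concurrent processor update of the C bit.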

7.6.3.2.3 Modifying the Virtual Address 

If the virtual address is being changed to a different address within the same hash class 
(primary or secondary), the following flow suffices: 


lock(PTE) 
PTE[VSID,API,H,V] <- new values (V = 1) 
sync                                 /* ensure update completed */ 
tlbie(old_EA)                        /* invalidate old translation */ 
eieio                                /* order tlbie before tlbsync */ 
tlbsync                              /* ensure tlbie completed on all processors */ 
sync                                 /* ensure tlbsync completed */ 
unlock(PTE) 


In this pseudocode flow, the tlbsync and the sync instruction that follows it are only 
required if consistency must be maintained with other PowerPC processors in a 
multiprocessor system (and the software is to be used in a multiprocessor environment). 

In this example, if the new address is not a cache synonym (alias) of the old address, care 
must be taken to also flush (or invalidate) from an on-chip cache any cache synonyms for 
the page. Thus, a temporary virtual address that is a cache synonym with the page whose 
PTE is being modified can be assigned and then used for the cache flushing (or 
invalidation). 


To modify the WIMG or PP bits without overwriting an R or C bit update being performed 
by the processor, a sequence similar to the one shown above can be used, except that the 
second line is replaced by a loop containing an lwarx/stwcx. instruction pair that emulates 
an atomic compare and swap of the low-order word of the PTE. 
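The retry loop described above can be sketched in emulated form. This is only a model: the `Word` class and its `compare_and_swap` method are hypothetical stand-ins for a memory word accessed with an lwarx/stwcx. pair, the `before_swap` hook exists solely to simulate a concurrent hardware update, and the surrounding lock, tlbie, and sync steps are omitted.

```python
class Word:
    """Emulates a 32-bit memory word that supports compare-and-swap."""

    def __init__(self, value):
        self.value = value

    def compare_and_swap(self, expected, new):
        if self.value != expected:
            return False          # corresponds to stwcx. failing
        self.value = new
        return True


FIELD_MASK = (0xF << 3) | 0x3     # WIMG (bits 25-28) and PP (bits 30-31)


def update_wimg_pp(word, wimg, pp, before_swap=None):
    """Replace WIMG and PP without losing a concurrent R or C update."""
    while True:
        old = word.value                                  # lwarx
        new = (old & ~FIELD_MASK & 0xFFFFFFFF) | ((wimg & 0xF) << 3) | (pp & 0x3)
        if before_swap is not None:
            before_swap(word)                             # simulated interference
        if word.compare_and_swap(old, new):               # stwcx.
            return new


low_word = Word((0x12345 << 12) | (0b0010 << 3) | 0b10)
fired = []


def hardware_sets_r(word):
    if not fired:                 # interfere once, on the first attempt only
        word.value |= 1 << 8      # processor sets R
        fired.append(True)


result = update_wimg_pp(low_word, wimg=0b0100, pp=0b01, before_swap=hardware_sets_r)
print(hex(result))
```

The first attempt fails because R changed between the load and the swap; the second attempt reloads the word, so the R bit set by the processor survives the update.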

7.6.3.3 Deleting a Page Table Entry 

In this example, the entry is locked, marked invalid, invalidated in the TLB, and unlocked. 
Again, note that the tlbsync and the sync instruction that follows it are only required if 
consistency must be maintained with other PowerPC processors in a multiprocessor system 
(and the software is to be used in a multiprocessor environment). 

lock(PTE) 
PTE[V] <- 0                          /* (other fields don’t matter) */ 
sync                                 /* ensure update completed */ 
tlbie(old_EA)                        /* invalidate old translation */ 
eieio                                /* order tlbie before tlbsync */ 
tlbsync                              /* ensure tlbie completed on all processors */ 
sync                                 /* ensure tlbsync completed */ 
unlock(PTE) 

7.6.4 Segment Register Updates 

Synchronization requirements for using the move to segment register instructions are 
described in Section 2.3.17, “Synchronization Requirements for Special Registers and for 
Lookaside Buffers.” 

7.7 Direct-Store Segment Address Translation 

As described for memory segments, all accesses generated by the processor (with 
translation enabled) that do not map to a BAT area, map to a segment descriptor. If T = 1 
for the selected segment descriptor, the access maps to the direct-store interface, invoking 
a specific bus protocol for accessing I/O devices. 

Direct-store segments are provided for POWER compatibility. As the direct-store interface 
is present only for compatibility with existing I/O devices that used this interface and the 
direct-store interface protocol is not optimized for performance, its use is discouraged. 
Additionally, the direct-store facility is being phased out of the architecture. This 
functionality is considered optional (to allow for those earlier devices that implemented it). 
However, future devices are not likely to support it. Thus, software should not depend on 
its results and new software should not use it. Applications that require low-latency 
load/store access to external address space should use memory-mapped I/O, rather than the 
direct-store interface. 

7.7.1 Segment Descriptors for Direct-Store Segments 

The format of the fields in the segment descriptors depends on the value of the T bit. The 
segment descriptors reside in one of 16 segment registers. 

Figure 7-26 shows the register format for the segment registers when the T bit is set. 


    | T | Ks | Kp |    BUID    |    CNTLR_SPEC    |
      0   1    2    3       11   12            31

Figure 7-26. Segment Register Format for Direct-Store Segments 


Table 7-23 shows the bit definitions for the segment registers when the T bit is set. 

Table 7-23. Segment Register Bit Definitions for Direct-Store Segments 


Bit      Name          Description

0        T             T = 1 selects this format.
1        Ks            Supervisor-state protection key
2        Kp            User-state protection key
3-11     BUID          Bus unit ID
12-31    CNTLR_SPEC    Device-specific data for I/O controller
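The bit definitions in Table 7-23 can be turned into a small decoder. This is an illustrative sketch: the function name and the sample register value are hypothetical.

```python
def decode_direct_store_sr(sr):
    """Split a 32-bit segment register value into its direct-store fields."""
    if (sr >> 31) & 1 != 1:
        raise ValueError("T = 0: not a direct-store segment descriptor")
    return {
        "T": 1,                           # bit 0
        "Ks": (sr >> 30) & 1,             # bit 1
        "Kp": (sr >> 29) & 1,             # bit 2
        "BUID": (sr >> 20) & 0x1FF,       # bits 3-11
        "CNTLR_SPEC": sr & 0xFFFFF,       # bits 12-31
    }


fields = decode_direct_store_sr(0xC7F12345)   # arbitrary sample value
print(fields)
```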
7.7.2 Direct-Store Segment Accesses 

When the address translation process determines that the segment descriptor has T = 1, 
direct-store segment address translation is selected; no reference is made to the page tables 
and neither the referenced nor the changed bits are updated. These accesses are performed 
as if the WIMG bits were 0b0101; that is, caching is inhibited, the accesses bypass the cache, 
hardware-enforced coherency is not required, and the accesses are considered guarded. 

The specific protocol invoked to perform these accesses involves the transfer of address and 
data information; however, the PowerPC OEA does not define the exact hardware protocol 
used for direct-store accesses. Some instructions may cause multiple address/data 
transactions to occur on the bus. In this case, the address for each transaction is handled 
individually with respect to the MMU. 

The following describes the data that is typically sent to the memory controller by 
processors that implement the direct-store function: 

• One of the Kx bits (Ks or Kp) is selected to be the key as follows: 

- For supervisor accesses (MSR[PR] = 0), the Ks bit is used and Kp is ignored. 
- For user accesses (MSR[PR] = 1), the Kp bit is used and Ks is ignored. 

• An implementation-dependent portion of the segment descriptor. 

• An implementation-dependent portion of the effective address. 

7.7.3 Direct-Store Segment Protection 

Page-level memory protection as described in Section 7.5.4, “Page Memory Protection,” is 
not provided for direct-store segments. The appropriate key bit (Ks or Kp) from the 
segment descriptor is sent to the memory controller, and the memory controller implements 
any protection required. Frequently, no such mechanism is provided; the fact that a direct- 
store segment is mapped into the address space of a process may be regarded as sufficient 
authority to access the segment. 

7.7.4 Instructions Not Supported in Direct-Store Segments 

The following instructions are not supported at all and cause either a DSI exception or 
boundedly-undefined results when issued with an effective address that selects a segment 
descriptor that has T = 1: 

• lwarx 

• stwcx. 

• eciwx 

• ecowx 
7.7.5 Instructions with No Effect in Direct-Store Segments 

The following instructions are executed as no-ops when issued with an effective address 
that selects a segment where T = 1: 

• dcba 

• dcbt 

• dcbtst 

• dcbf 

• dcbi 

• dcbst 

• dcbz 

• icbi 

7.7.6 Direct-Store Segment Translation Summary Flow 

Figure 7-27 shows the flow used by the MMU when direct-store segment address 
translation is selected. Figure 7-27 expands the Direct-Store Segment Translation stub 
found in Figure 7-4 for both instruction and data accesses. In the case of a floating-point 
load or store operation to a direct-store segment, it is implementation-specific whether the 
alignment exception occurs. In the case of an eciwx, ecowx, lwarx, or stwcx. instruction, 
the implementation either sets the DSISR as shown and causes the DSI exception, or causes 
boundedly-undefined results. 
Figure 7-27. Direct-Store Segment Translation Flow 

Chapter 8. Instruction Set 

This chapter lists the PowerPC instruction set in alphabetical order by mnemonic. Each 
entry includes the instruction formats and a quick reference ‘legend’ that provides such 
information as the level(s) of the PowerPC architecture in which the instruction may be 
found: user instruction set architecture (UISA), virtual environment architecture (VEA), 
and operating environment architecture (OEA); and the privilege level of the 
instruction: user- or supervisor-level (an instruction is assumed to be user-level unless the 
legend specifies that it is supervisor-level); and the instruction formats. 

The format diagrams show, horizontally, all valid combinations of instruction fields; for a 
graphical representation of these instruction formats, see Appendix A, “PowerPC 
Instruction Set Listings.” A description of the instruction fields and pseudocode 
conventions is also provided. 

For more information on the PowerPC instruction set, refer to Chapter 4, “Addressing 
Modes and Instruction Set Summary.” 

NOTE: The architecture specification refers to user-level and supervisor-level as 
problem state and privileged state, respectively. 


8.1 Instruction Formats 


Instructions are four bytes long and word-aligned, so when instruction addresses are 
presented to the processor (as in branch instructions) the two low-order bits are ignored. 
Similarly, whenever the processor develops an instruction address, its two low-order bits 
are zero. 



Bits 0-5 always specify the primary opcode. Many instructions also have an extended 
opcode. The remaining bits of the instruction contain one or more fields for the different 
instruction formats. 


Some instruction fields are reserved or must contain a predefined value as shown in the 
individual instruction layouts. If a reserved field does not have all bits cleared, or if a field 
that must contain a particular value does not contain that value, the instruction form is 
invalid and the results are as described in Chapter 4, “Addressing Modes and Instruction Set 
Summary.” 

Within the instruction format diagram the instruction operation code and extended 
operation code (if extended form) are specified in decimal. These fields have been 
converted to hexadecimal and are shown on line two for each instruction definition. 
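Field extraction under the big-endian bit numbering used in this chapter (bit 0 is the most-significant bit) can be sketched as follows. The helper name is hypothetical, and the sample word 0x7C64_2A14 is assumed here to be the encoding of add r3,r4,r5.

```python
def field(instruction, first, last):
    """Return bits first..last of a 32-bit instruction (bit 0 = most significant)."""
    width = last - first + 1
    return (instruction >> (31 - last)) & ((1 << width) - 1)


word = 0x7C642A14                 # assumed sample: add r3,r4,r5
opcd = field(word, 0, 5)          # primary opcode
xo = field(word, 22, 30)          # extended opcode (XO form with OE bit)
rd, ra, rb = field(word, 6, 10), field(word, 11, 15), field(word, 16, 20)

# Show the opcodes in decimal and hexadecimal.
print(opcd, hex(opcd), xo, hex(xo), rd, ra, rb)
```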
8.1.1 Split-Field Notation 

Some instruction fields occupy more than one contiguous sequence of bits or occupy a 
contiguous sequence of bits used in permuted order. Such a field is called a split field. Split 
fields that represent the concatenation of the sequences from left to right are shown in 
lowercase letters. These split fields, spr and tbr, are described in Table 8-1. 


Table 8-1. Split-Field Notation and Conventions 


Field          Description

spr (11-20)    This field is used to specify a special-purpose register for the mtspr and mfspr
               instructions. The encoding is described in Section 4.4.2.2, “Move to/from
               Special-Purpose Register Instructions (OEA).”

tbr (11-20)    This field is used to specify either the time base lower (TBL) or time base
               upper (TBU).


Split fields that represent the concatenation of the sequences in some order, which need not 
be left to right (as described for each affected instruction), are shown in uppercase letters. 
These split fields (MB, ME, and SH) are described in Table 8-2. 

8.1.2 Instruction Fields 

Table 8-2 describes the instruction fields used in the various instruction formats. 


Table 8-2. Instruction Syntax Conventions 


Field          Description

AA (30)        Absolute address bit.
               0  The immediate field represents an address relative to the current instruction
                  address (CIA). (For more information on the CIA, see Table 8-3.) The effective
                  (logical) address of the branch is either the sum of the LI field sign-extended
                  to 32 bits and the address of the branch instruction or the sum of the BD field
                  sign-extended to 32 bits and the address of the branch instruction.
               1  The immediate field represents an absolute address. The effective address (EA)
                  of the branch is the LI field sign-extended to 32 bits or the BD field
                  sign-extended to 32 bits.
               Note: The LI and BD fields are sign-extended to 32 bits.

BD (16-29)     Immediate field specifying a 14-bit signed two's complement branch displacement
               that is concatenated on the right with 0b00 and sign-extended to 32 bits.

BI (11-15)     This field is used to specify a bit in the CR to be used as the condition of a
               branch conditional instruction.

BO (6-10)      This field is used to specify options for the branch conditional instructions.
               The encoding is described in Section 4.2.4.2, “Conditional Branch Control.”

crbA (11-15)   This field is used to specify a bit in the CR to be used as a source.

crbB (16-20)   This field is used to specify a bit in the CR to be used as a source.

crbD (6-10)    This field is used to specify a bit in the CR, or in the FPSCR, as the
               destination of the result of an instruction.

crfD (6-8)     This field is used to specify one of the CR fields, or one of the FPSCR fields,
               as a destination.

crfS (11-13)   This field is used to specify one of the CR fields, or one of the FPSCR fields,
               as a source.



CRM (12-19)    This field mask is used to identify the CR fields that are to be updated by the
               mtcrf instruction.

d (16-31)      Immediate field specifying a signed two's complement integer that is
               sign-extended to 32 bits.

FM (7-14)      This field mask is used to identify the FPSCR fields that are to be updated by
               the mtfsf instruction.

frA (11-15)    This field is used to specify an FPR as a source.

frB (16-20)    This field is used to specify an FPR as a source.

frC (21-25)    This field is used to specify an FPR as a source.

frD (6-10)     This field is used to specify an FPR as the destination.

frS (6-10)     This field is used to specify an FPR as a source.

IMM (16-19)    Immediate field used as the data to be placed into a field in the FPSCR.

LI (6-29)      Immediate field specifying a 24-bit signed two's complement integer that is
               concatenated on the right with 0b00 and sign-extended to 32 bits.

LK (31)        Link bit.
               0  Does not update the link register (LR).
               1  Updates the LR. If the instruction is a branch instruction, the address of the
                  instruction following the branch instruction is placed into the LR.

MB (21-25)     These fields are used in rotate instructions to specify a 32-bit mask consisting
and            of 1 bits from bit MB through bit ME inclusive, and 0 bits elsewhere, as
ME (26-30)     described in Section 4.2.1.4, “Integer Rotate and Shift Instructions.”

NB (16-20)     This field is used to specify the number of bytes to move in an immediate string
               load or store.

OE (21)        This field is used for extended arithmetic to enable setting OV and SO in the XER.

OPCD (0-5)     Primary opcode field

rA (11-15)     This field is used to specify a GPR to be used as a source or destination.

rB (16-20)     This field is used to specify a GPR to be used as a source.

Rc (31)        Record bit.
               0  Does not update the condition register (CR).
               1  Updates the CR to reflect the result of the operation.
                  For integer instructions, CR bits 0-2 are set to reflect the result as a signed
                  quantity and CR bit 3 receives a copy of the summary overflow bit, XER[SO]. The
                  result as an unsigned quantity or a bit string can be deduced from the EQ bit.
                  For floating-point instructions, CR bits 4-7 are set to reflect floating-point
                  exception, floating-point enabled exception, floating-point invalid operation
                  exception, and floating-point overflow exception.
               (Note: Exceptions are referred to as interrupts in the architecture specification.)

rD (6-10)      This field is used to specify a GPR to be used as a destination.

rS (6-10)      This field is used to specify a GPR to be used as a source.

SH (16-20)     This field is used to specify a shift amount.

SIMM (16-31)   This immediate field is used to specify a 16-bit signed integer.

SR (12-15)     This field is used to specify one of the 16 segment registers.






















































TO (6-10)      This field is used to specify the conditions on which to trap. The encoding is
               described in Section 4.2.4.6, “Trap Instructions.”

UIMM (16-31)   This immediate field is used to specify a 16-bit unsigned integer.

XO (21-30,     Extended opcode field.
22-30, 26-30)
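The way the AA, LI, and BD fields combine into a branch target can be sketched as follows. The function names are hypothetical, and the sketch assumes that primary opcode 18 identifies the I-form branch (LI field) and primary opcode 16 the B-form conditional branch (BD field); condition evaluation and link register updates are not modeled.

```python
def sign_extend(value, bits):
    """Interpret the low-order `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def branch_target(instruction, cia):
    """Return the 32-bit target address of a b or bc instruction located at cia."""
    opcd = instruction >> 26
    aa = (instruction >> 1) & 1
    if opcd == 18:                                   # LI (6-29) || 0b00
        displacement = sign_extend(instruction & 0x03FFFFFC, 26)
    elif opcd == 16:                                 # BD (16-29) || 0b00
        displacement = sign_extend(instruction & 0x0000FFFC, 16)
    else:
        raise ValueError("not an immediate-form branch")
    target = displacement if aa else cia + displacement
    return target & 0xFFFFFFFC                       # two low-order bits are zero


backward = branch_target(0x4BFFFFF8, cia=0x1000)     # relative branch, LI = -2
absolute = branch_target(0x48000102, cia=0x1000)     # AA = 1, target 0x100
conditional = branch_target(0x40820010, cia=0x2000)  # relative, BD = 4
print(hex(backward), hex(absolute), hex(conditional))
```

Masking the field in place, rather than shifting it out and back, keeps the two appended zero bits without an extra step.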


8.1.3 Notation and Conventions 

The operation of some instructions is described by a semiformal language (pseudocode). 
See Table 8-3 for a list of pseudocode notation and conventions used throughout this 
chapter. 

Table 8-3. Notation and Conventions 


Notation/Convention 

Meaning 


Assignment 

^ — iea 

Assignment of an instruction effective address. . 

-i 

NOT logical operator 

* 

Multiplication 

-r- 

Division (yielding quotient) 

+ 

Two's-complement addition 

- 

Two's-complement subtraction, unary minus 

=,* 

Equals and Not Equals relations 

< ,<>, >, 

Signed comparison relations 

. (period) 

Update. When used as a character of an instruction mnemonic, a period (.) means that the 
instruction updates the condition register field. 

c

Carry. When used as a character of an instruction mnemonic, a ‘c’ indicates a carry out in
XER[CA].

e

Extended Precision.

When used as the last character of an instruction mnemonic, an ‘e’ indicates the use of
XER[CA] as an operand in the instruction and records a carry out in XER[CA].

o

Overflow. When used as a character of an instruction mnemonic, an ‘o’ indicates the record
of an overflow in XER[OV] and CR0[SO] for integer instructions or CR1[SO] for floating-point
instructions.

<U, >U 

Unsigned comparison relations 

? 

Unordered comparison relation 

&, |

AND, OR logical operators

||

Used to describe the concatenation of two values (that is, 010 || 111 is the same as 010111)

⊕, ≡

Exclusive-OR, Equivalence logical operators (for example, (a ≡ b) = (a ⊕ ¬ b))

0bnnnn

A number expressed in binary format.

0xnnnn or
x’nnnn nnnn’

A number expressed in hexadecimal format.

(n)x

The replication of x, n times (that is, x concatenated to itself n - 1 times).

(n)0 and (n)1 are special cases. A description of the special cases follows:

• (n)0 means a field of n bits with each bit equal to 0. Thus (5)0 is equivalent to 0b00000.

• (n)1 means a field of n bits with each bit equal to 1. Thus (5)1 is equivalent to 0b11111.

(rA|0)

The contents of rA if the rA field has the value 1-31, or the value 0 if the rA field is 0.

(rX)

The contents of rX

x[n]

n is a bit or field within x, where x is a register

x^n

x is raised to the nth power

ABS(x)

Absolute value of x

CEIL(x)

Least integer ≥ x

Characterization 

Reference to the setting of status bits in a standard way that is explained in the text. 

CIA 

Current instruction address. 

The 32-bit address of the instruction being described by a sequence of pseudocode. Used by 
relative branches to set the next instruction address (NIA) and by branch instructions with LK 
= 1 to set the link register. Does not correspond to any architected register. 

Clear 

Clear the leftmost or rightmost n bits of a register to 0. This operation is used for rotate and 
shift instructions. 

Clear left and shift left 

Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can 
be used to scale a known non-negative array index by the width of an element. These 
operations are used for rotate and shift instructions. 

Cleared 

Bits are set to 0. 

Do 

Do loop. 

• Indenting shows range. 

• “To” and/or “by” clauses specify incrementing an iteration variable. 

• “While” clauses give termination conditions. 

DOUBLE(x) 

Result of converting x from floating-point single-precision format to floating-point double- 
precision format. 

Extract 

Select a field of n bits starting at bit position b in the source register, right or left justify this 
field in the target register, and clear all other bits of the target register to zero. This operation 
is used for rotate and shift instructions. 

EXTS(x) 

Result of extending x on the left with sign bits 

GPR(x) 

General-purpose register x 

if...then...else...

Conditional execution, indenting shows range, else is optional.

Insert 

Select a field of n bits in the source register, insert this field starting at bit position b of the 
target register, and leave other bits of the target register unchanged. (No simplified 
mnemonic is provided for insertion of a field when operating on double words; such an 
insertion requires more than one instruction.) This operation is used for rotate and shift 
instructions. (Note: Simplified mnemonics are referred to as extended mnemonics in the 
architecture specification.) 

Leave 

Leave innermost do loop, or the do loop described in leave statement. 

MASK(x, y) 

Mask having ones in positions x through y (wrapping if x > y) and zeros elsewhere. 

MEM(x, y) 

Contents of y bytes of memory starting at address x. 

NIA 

Next instruction address, which is the 32-bit address of the next instruction to be executed 
(the branch destination) after a successful branch. In pseudocode, a successful branch is 
indicated by assigning a value to NIA. For instructions which do not branch, the next 
instruction address is CIA + 4. Does not correspond to any architected register. 

OEA 

PowerPC operating environment architecture 

Rotate 

Rotate the contents of a register right or left n bits without masking. This operation is used for 
rotate and shift instructions. 

Reserved 

An unused field, must be left with zeros. 

ROTL(x, y) 

Result of rotating the value x left y positions, where x is 32 bits long 

Set 

Bits are set to 1.

Shift 

Shift the contents of a register right or left n bits, clearing vacated bits (logical shift). This 
operation is used for rotate and shift instructions. 

SINGLE(x) 

Result of converting x from floating-point double-precision format to floating-point single- 
precision format. 

SPR(x) 

Special-purpose register x 

TRAP 

Invoke the system trap handler. 

Undefined 

An undefined value. The value may vary from one implementation to another, and from one 
execution to another on the same implementation. 

UISA 

PowerPC user instruction set architecture 

VEA 

PowerPC virtual environment architecture 
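
The helper functions named in Table 8-3 can be modeled in a few lines. The following is a minimal Python sketch, not part of the architecture definition. The function names (exts, rotl, mask, replicate, concat) are illustrative choices, and the sketch assumes 32-bit words with bit 0 as the most-significant bit, as in this manual.

```python
MASK32 = 0xFFFFFFFF

def exts(value, width):
    """EXTS(x): sign-extend a width-bit field to a 32-bit two's-complement word."""
    value &= (1 << width) - 1
    if value & (1 << (width - 1)):
        value -= 1 << width
    return value & MASK32

def rotl(x, y):
    """ROTL(x, y): rotate the 32-bit value x left by y positions."""
    y %= 32
    return ((x << y) | (x >> (32 - y))) & MASK32

def mask(x, y):
    """MASK(x, y): ones in bit positions x through y, wrapping if x > y."""
    ones_from_x = MASK32 >> x                    # bits x..31 set
    ones_to_y = (MASK32 << (31 - y)) & MASK32    # bits 0..y set
    if x <= y:
        return ones_from_x & ones_to_y
    return ones_from_x | ones_to_y

def replicate(n, bit):
    """(n)x for a single bit x: an n-bit field of identical bits."""
    return (1 << n) - 1 if bit else 0

def concat(hi, lo, lo_width):
    """hi || lo, where lo is lo_width bits wide."""
    return (hi << lo_width) | (lo & ((1 << lo_width) - 1))
```

For instance, concat(0b010, 0b111, 3) reproduces the table's concatenation example, and mask(28, 3) shows the wrapping case.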









































Table 8-4 describes instruction field notation conventions used throughout this chapter. 

Table 8-4. Instruction Field Conventions 


The Architecture Specification

Equivalent to:

BA, BB, BT 

crbA, crbB, crbD (respectively) 

BF, BFA 

crfD, crfS (respectively) 

D 

d 

DS 

ds 

FLM 

FM 

FRA, FRB, FRC, FRT, FRS 

frA, frB, frC, frD, frS (respectively) 

FXM 

CRM 

RA, RB, RT, RS 

rA, rB, rD, rS (respectively) 

SI 

SIMM 

U 

IMM 

UI

UIMM

/, //, ///

0...0 (shaded) 


Precedence rules for pseudocode operators are summarized in Table 8-5. 

Table 8-5. Precedence Rules 


Operators 

Associativity 

x[n], function evaluation 

Left to right 

(n)x or replication, 
x(n) or exponentiation 

Right to left 

unary -, ¬

Right to left

*, ÷

Left to right

+, -

Left to right

||

Left to right

=, ≠, <, ≤, >, ≥, <U, >U, ?

Left to right

&, ⊕, ≡

Left to right

|

Left to right

- (range)

None

←, ←iea

None


Operators higher in Table 8-5 are applied before those lower in the table. Operators at the
same level in the table associate from left to right, from right to left, or not at all, as shown.
For example, - (subtraction) associates from left to right, so a - b - c = (a - b) - c.
Parentheses are used to override the evaluation order implied by Table 8-5, or to increase
clarity; parenthesized expressions are evaluated before serving as operands.

8.1.4 Computation Modes 

The PowerPC architecture is defined for 32-bit implementations, in which all registers 
except the FPRs are 32 bits long, and effective addresses are 32 bits long. The FPR registers 
are 64 bits long. For more information on computation modes see Section 4.1.1, 
“Computation Modes.” 

8.2 PowerPC Instruction Set 

The remainder of this chapter lists and describes the instruction set for the PowerPC 
architecture. The instructions are listed in alphabetical order by mnemonic. Figure 8-1 
shows the format for each instruction description page. 


[Figure: annotated sample instruction description page for addx. Callouts identify the
instruction name, the instruction operation code (in hexadecimal), the instruction syntax,
the instruction encoding, the pseudocode description of instruction operation, the text
description of instruction operation, the registers altered by the instruction, and the
quick reference legend.]

Figure 8-1. Instruction Description 

NOTE: The execution unit that executes the instruction may not be the same for all 
PowerPC processors. 


addx

Add (x’7C00 0214’)

add     rD,rA,rB    (OE = 0 Rc = 0)
add.    rD,rA,rB    (OE = 0 Rc = 1)
addo    rD,rA,rB    (OE = 1 Rc = 0)
addo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:  31    D      A       B       OE   266     Rc
Bits:   0-5   6-10   11-15   16-20   21   22-30   31

rD ← (rA) + (rB)

The sum (rA) + (rB) is placed into rD.

The add instruction is preferred for addition because it sets few status bits.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: SO, OV (If OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO
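
The interaction between the OE and Rc options can be illustrated with a small model. The following Python sketch approximates addo. (OE = 1, Rc = 1) under the assumption that CR0 is derived from the 32-bit result in rD with SO copied from XER. The function name addo_record and the tuple it returns are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit word as a signed two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def addo_record(ra, rb, so_in=0):
    """Model of addo.: returns (rD, CR0 as LT|GT|EQ|SO, XER[OV], XER[SO])."""
    rd = (ra + rb) & MASK32
    # Signed overflow: both operands differ in sign from the result.
    ov = int(((ra ^ rd) & (rb ^ rd) & 0x80000000) != 0)
    so = so_in | ov                      # SO is sticky
    s = to_signed(rd)
    cr0 = (int(s < 0) << 3) | (int(s > 0) << 2) | (int(s == 0) << 1) | so
    return rd, cr0, ov, so
```

Adding 1 to 0x7FFF_FFFF in this model sets LT together with SO, which illustrates why the CR0 field may not reflect the infinitely precise result when overflow occurs.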



addcx

Add Carrying (x’7C00 0014’)

addc     rD,rA,rB    (OE = 0 Rc = 0)
addc.    rD,rA,rB    (OE = 0 Rc = 1)
addco    rD,rA,rB    (OE = 1 Rc = 0)
addco.   rD,rA,rB    (OE = 1 Rc = 1)

Field:  31    D      A       B       OE   10      Rc
Bits:   0-5   6-10   11-15   16-20   21   22-30   31

rD ← (rA) + (rB)

The sum (rA) + (rB) is placed into rD.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: CA

Affected: SO, OV (If OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO



addex

Add Extended (x’7C00 0114’)

adde     rD,rA,rB    (OE = 0 Rc = 0)
adde.    rD,rA,rB    (OE = 0 Rc = 1)
addeo    rD,rA,rB    (OE = 1 Rc = 0)
addeo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:  31    D      A       B       OE   138     Rc
Bits:   0-5   6-10   11-15   16-20   21   22-30   31

rD ← (rA) + (rB) + XER[CA]

The sum (rA) + (rB) + XER[CA] is placed into rD.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: CA

Affected: SO, OV (If OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO
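
A typical use of the carry bit is multiword addition: addc produces the low-order word and a carry out in XER[CA], and adde consumes that carry for the next word. The following Python sketch models a 64-bit addition built from 32-bit halves. The function names and the (result, carry) return convention are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def addc(ra, rb):
    """Model of addc: returns (rD, XER[CA])."""
    total = ra + rb
    return total & MASK32, total >> 32

def adde(ra, rb, ca):
    """Model of adde: returns (rD, XER[CA]) given the incoming carry."""
    total = ra + rb + ca
    return total & MASK32, total >> 32

def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values held as (high word, low word) pairs."""
    lo, ca = addc(a_lo, b_lo)       # low words, carry out recorded
    hi, ca = adde(a_hi, b_hi, ca)   # high words plus carry in
    return hi, lo, ca
```

The final carry returned by add64 corresponds to the carry out of the high-order word and could feed a further adde for wider operands.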


addi

Add Immediate (x’3800 0000’)

addi    rD,rA,SIMM

Field:  14    D      A       SIMM
Bits:   0-5   6-10   11-15   16-31

if rA = 0
    then rD ← EXTS(SIMM)
    else rD ← (rA) + EXTS(SIMM)

The sum (rA|0) + sign extended SIMM is placed into rD.

The addi instruction is preferred for addition because it sets few status bits.

NOTE: addi uses the value 0, not the contents of GPR0, if rA = 0.

Other registers altered:

• None

Simplified mnemonics:

li      rD,value        equivalent to    addi    rD,0,value
la      rD,disp(rA)     equivalent to    addi    rD,rA,disp
subi    rD,rA,value     equivalent to    addi    rD,rA,-value

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D
addic

Add Immediate Carrying (x’3000 0000’)

addic    rD,rA,SIMM

Field:  12    D      A       SIMM
Bits:   0-5   6-10   11-15   16-31

rD ← (rA) + EXTS(SIMM)

The sum (rA) + sign extended SIMM is placed into rD.

Other registers altered:

• XER:

Affected: CA

NOTE: The setting of the affected bits in the XER reflects overflow of the 32-bit
result. For more information see Section 2.1.5, “XER Register.”

Simplified mnemonics:

subic    rD,rA,value     equivalent to    addic    rD,rA,-value

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D
addic.

Add Immediate Carrying and Record (x’3400 0000’)

addic.    rD,rA,SIMM

Field:  13    D      A       SIMM
Bits:   0-5   6-10   11-15   16-31

rD ← (rA) + EXTS(SIMM)

The sum (rA) + the sign extended SIMM is placed into rD.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: CA

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

Simplified mnemonics:

subic.    rD,rA,value     equivalent to    addic.    rD,rA,-value

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D


addis

Add Immediate Shifted (x’3C00 0000’)

addis    rD,rA,SIMM

Field:  15    D      A       SIMM
Bits:   0-5   6-10   11-15   16-31

if rA = 0
    then rD ← (SIMM || (16)0)
    else rD ← (rA) + (SIMM || (16)0)

The sum (rA|0) + (SIMM || 0x0000) is placed into rD.

The addis instruction is preferred for addition because it sets few status bits.

NOTE: addis uses the value 0, not the contents of GPR0, if rA = 0.

Other registers altered:

• None

Simplified mnemonics:

lis      rD,value        equivalent to    addis    rD,0,value
subis    rD,rA,value     equivalent to    addis    rD,rA,-value

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D
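
Because addi sign-extends its immediate while addis shifts it, the two instructions are often paired to build a full 32-bit constant. The following Python sketch models that pairing under the (rA|0) rule described above. The helper load_constant is hypothetical, and it assumes rD is not GPR0, since GPR0 in the rA position would be read as the value 0 in the second step.

```python
MASK32 = 0xFFFFFFFF

def exts16(simm):
    """Sign-extend a 16-bit immediate to a 32-bit word."""
    simm &= 0xFFFF
    return (simm - 0x10000 if simm & 0x8000 else simm) & MASK32

def addi(ra_field, gpr, simm):
    """Model of addi: the rA field value 0 selects the constant 0, not GPR0."""
    base = 0 if ra_field == 0 else gpr[ra_field]
    return (base + exts16(simm)) & MASK32

def addis(ra_field, gpr, simm):
    """Model of addis: the immediate is concatenated with 16 zero bits."""
    base = 0 if ra_field == 0 else gpr[ra_field]
    return (base + ((simm & 0xFFFF) << 16)) & MASK32

def load_constant(value, rd, gpr):
    """Hypothetical two-instruction sequence: lis rD,hi then addi rD,rD,lo."""
    lo = value & 0xFFFF
    # Compensate for the sign extension that addi applies to lo.
    hi = ((value >> 16) + (1 if lo & 0x8000 else 0)) & 0xFFFF
    gpr[rd] = addis(0, gpr, hi)
    gpr[rd] = addi(rd, gpr, lo)
    return gpr[rd]
```

The adjustment of hi is needed whenever bit 16 of the constant (the sign bit of the low halfword) is set. Without it the sign-extended low half would subtract 0x1_0000 from the result.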
addmex

Add to Minus One Extended (x’7C00 01D4’)

addme     rD,rA    (OE = 0 Rc = 0)
addme.    rD,rA    (OE = 0 Rc = 1)
addmeo    rD,rA    (OE = 1 Rc = 0)
addmeo.   rD,rA    (OE = 1 Rc = 1)

Field:  31    D      A       0 0 0 0 0   OE   234     Rc
Bits:   0-5   6-10   11-15   16-20       21   22-30   31

Reserved: bits 16-20.

rD ← (rA) + XER[CA] - 1

The sum (rA) + XER[CA] + 0xFFFF_FFFF is placed into rD.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: CA

Affected: SO, OV (If OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO


addzex

Add to Zero Extended (x’7C00 0194’)

addze     rD,rA    (OE = 0 Rc = 0)
addze.    rD,rA    (OE = 0 Rc = 1)
addzeo    rD,rA    (OE = 1 Rc = 0)
addzeo.   rD,rA    (OE = 1 Rc = 1)

Field:  31    D      A       0 0 0 0 0   OE   202     Rc
Bits:   0-5   6-10   11-15   16-20       21   22-30   31

Reserved: bits 16-20.

rD ← (rA) + XER[CA]

The sum (rA) + XER[CA] is placed into rD.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see
next bullet item).

• XER:

Affected: CA

Affected: SO, OV (If OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO
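
One way to read addme and addze is as adde with an implied second operand of all ones or all zeros. The following Python sketch shows a possible use: adding a sign-extended 32-bit value to a 64-bit value held in two words. The helper add_signed32_to_64 is hypothetical and is offered only as an illustration of the carry behavior described above.

```python
MASK32 = 0xFFFFFFFF

def addc(ra, rb):
    """Model of addc: returns (rD, XER[CA])."""
    total = ra + rb
    return total & MASK32, total >> 32

def addze(ra, ca):
    """Model of addze: (rA) + XER[CA]."""
    total = ra + ca
    return total & MASK32, total >> 32

def addme(ra, ca):
    """Model of addme: (rA) + XER[CA] + 0xFFFF_FFFF."""
    total = ra + ca + 0xFFFFFFFF
    return total & MASK32, total >> 32

def add_signed32_to_64(hi, lo, s):
    """Add the signed 32-bit word s to the 64-bit value (hi, lo)."""
    new_lo, ca = addc(lo, s)
    if s & 0x80000000:              # upper word of sign-extended s is all ones
        new_hi, _ = addme(hi, ca)
    else:                           # upper word of sign-extended s is zero
        new_hi, _ = addze(hi, ca)
    return new_hi, new_lo
```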


andx

AND (x’7C00 0038’)

and     rA,rS,rB    (Rc = 0)
and.    rA,rS,rB    (Rc = 1)

Field:  31    S      A       B       28      Rc
Bits:   0-5   6-10   11-15   16-20   21-30   31

rA ← (rS) & (rB)

The contents of rS are ANDed with the contents of rB and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X

andcx

AND with Complement (x’7C00 0078’)

andc     rA,rS,rB    (Rc = 0)
andc.    rA,rS,rB    (Rc = 1)

Field:  31    S      A       B       60      Rc
Bits:   0-5   6-10   11-15   16-20   21-30   31

rA ← (rS) & ¬ (rB)

The contents of rS are ANDed with the one’s complement of the contents of rB and the
result is placed into rA.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X


andi.

AND Immediate (x’7000 0000’)

andi.    rA,rS,UIMM

Field:  28    S      A       UIMM
Bits:   0-5   6-10   11-15   16-31

rA ← (rS) & ((16)0 || UIMM)

The contents of rS are ANDed with 0x0000 || UIMM and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D
andis.

AND Immediate Shifted (x’7400 0000’)

andis.    rA,rS,UIMM

Field:  29    S      A       UIMM
Bits:   0-5   6-10   11-15   16-31

rA ← (rS) & (UIMM || (16)0)

The contents of rS are ANDed with UIMM || 0x0000 and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               D
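
The two immediate AND forms differ only in where the 16-bit immediate is placed within the 32-bit mask. The following Python sketch models both, together with the CR0 update that these instructions always perform. The function names and the encoding of CR0 as a 4-bit integer (LT, GT, EQ, SO from most to least significant) are choices made for this illustration.

```python
def cr0_from(result, so=0):
    """Build the CR0 field (LT, GT, EQ, SO) from a 32-bit result."""
    signed = result - (1 << 32) if result & 0x80000000 else result
    return (int(signed < 0) << 3) | (int(signed > 0) << 2) | (int(signed == 0) << 1) | so

def andi_dot(rs, uimm, so=0):
    """Model of andi.: mask is (16)0 || UIMM."""
    ra = rs & (uimm & 0xFFFF)
    return ra, cr0_from(ra, so)

def andis_dot(rs, uimm, so=0):
    """Model of andis.: mask is UIMM || (16)0."""
    ra = rs & ((uimm & 0xFFFF) << 16)
    return ra, cr0_from(ra, so)
```

Because the upper halfword of the andi. mask is zero, its result can never have bit 0 set, so LT is never set by andi. in this model.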


bx

Branch (x’4800 0000’)

b      target_addr    (AA = 0 LK = 0)
ba     target_addr    (AA = 1 LK = 0)
bl     target_addr    (AA = 0 LK = 1)
bla    target_addr    (AA = 1 LK = 1)

Field:  18    LI     AA   LK
Bits:   0-5   6-29   30   31

if AA = 1
    then NIA ←iea EXTS(LI || 0b00)
    else NIA ←iea CIA + EXTS(LI || 0b00)
if LK = 1
    then LR ←iea CIA + 4

target_addr specifies the branch target address.

If AA = 1, then the branch target address is the value LI || 0b00 sign-extended.

If AA = 0, then the branch target address is the sum of LI || 0b00 sign-extended plus the
address of this instruction.

If LK = 1, then the effective address of the instruction following the branch instruction is
placed into the link register.

Other registers altered:

Affected: Link Register (LR) (If LK = 1)

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               I
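
The target address computation for the unconditional branch can be sketched directly from the pseudocode. In the following Python model, branch is an illustrative function name, li is the raw 24-bit LI field, and the link register is passed in and returned rather than held in global state.

```python
MASK32 = 0xFFFFFFFF

def exts(value, width):
    """Sign-extend a width-bit field to a 32-bit word."""
    value &= (1 << width) - 1
    if value & (1 << (width - 1)):
        value -= 1 << width
    return value & MASK32

def branch(cia, li, aa, lk, lr):
    """Model of b/ba/bl/bla: returns (NIA, LR)."""
    disp = exts(li << 2, 26)            # LI || 0b00, a 26-bit signed value
    nia = disp if aa else (cia + disp) & MASK32
    if lk:
        lr = (cia + 4) & MASK32         # address of the following instruction
    return nia, lr
```

Since LI is 24 bits and is concatenated with two zero bits, a relative branch in this model reaches roughly 32 MB in either direction from the current instruction address.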
bcx

Branch Conditional (x’4000 0000’)


bc      BO,BI,target_addr    (AA = 0 LK = 0)
bca     BO,BI,target_addr    (AA = 1 LK = 0)
bcl     BO,BI,target_addr    (AA = 0 LK = 1)
bcla    BO,BI,target_addr    (AA = 1 LK = 1)

Field:  16    BO     BI      BD      AA   LK
Bits:   0-5   6-10   11-15   16-29   30   31

if ¬ BO[2]
    then CTR ← CTR - 1
ctr_ok ← BO[2] | ((CTR ≠ 0) ⊕ BO[3])
cond_ok ← BO[0] | (CR[BI] ≡ BO[1])
if ctr_ok & cond_ok
    then
        if AA = 1
            then NIA ←iea EXTS(BD || 0b00)
            else NIA ←iea CIA + EXTS(BD || 0b00)
        if LK = 1
            then LR ←iea CIA + 4

The BI field specifies the bit in the condition register (CR) to be used as the condition of
the branch. The BO field is encoded as described in Table 8-6. Additional information
about BO field encoding is provided in Section 4.2.4.2, “Conditional Branch Control.”

Table 8-6. BO Operand Encodings

BO      Description
0000y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE.
0001y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE.
001zy   Branch if the condition is FALSE.
0100y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE.
0101y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE.
011zy   Branch if the condition is TRUE.
1z00y   Decrement the CTR, then branch if the decremented CTR ≠ 0.
1z01y   Decrement the CTR, then branch if the decremented CTR = 0.
1z1zz   Branch always.

In this table, z indicates a bit that is ignored.

Note: The z bits should be cleared, as they may be assigned a meaning in some future version of the
PowerPC architecture.

The y bit provides a hint about whether a conditional branch is likely to be taken, and may be used by some
PowerPC implementations to improve performance.
target_addr specifies the branch target address.

If AA = 0, the branch target address is the sum of BD || 0b00 sign-extended and the address
of this instruction.

If AA = 1, the branch target address is the value BD || 0b00 sign-extended.

If LK = 1, the effective address of the instruction following the branch instruction is placed
into the link register.

Other registers altered:

Affected: Count Register (CTR) (If BO[2] = 0)

Affected: Link Register (LR) (If LK = 1)

Simplified mnemonics:

blt     target        equivalent to    bc    12,0,target
bne     cr2,target    equivalent to    bc    4,10,target
bdnz    target        equivalent to    bc    16,0,target

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               B
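
The BO decoding in the pseudocode for the conditional branch can be checked against the simplified mnemonics with a short model. The following Python sketch evaluates ctr_ok and cond_ok. The function names are illustrative, cr is the whole 32-bit condition register, and bit 0 is the most-significant bit of each field.

```python
MASK32 = 0xFFFFFFFF

def bit(field, n, width):
    """Bit n of a width-bit field, numbering bit 0 as most significant."""
    return (field >> (width - 1 - n)) & 1

def bc_taken(bo, bi, cr, ctr):
    """Model of the bc decision: returns (branch taken, updated CTR)."""
    if not bit(bo, 2, 5):
        ctr = (ctr - 1) & MASK32                    # decrement unless BO[2] = 1
    ctr_ok = bit(bo, 2, 5) | (int(ctr != 0) ^ bit(bo, 3, 5))
    cond_ok = bit(bo, 0, 5) | int(bit(cr, bi, 32) == bit(bo, 1, 5))
    return bool(ctr_ok & cond_ok), ctr
```

With BO = 12 and BI = 0 (blt), the branch depends only on the LT bit of CR0. With BO = 16 (bdnz), the condition register is ignored and the branch depends only on the decremented CTR.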



bcctrx

Branch Conditional to Count Register (x’4C00 0420’)

bcctr     BO,BI    (LK = 0)
bcctrl    BO,BI    (LK = 1)

Field:  19    BO     BI      0 0 0 0 0   528     LK
Bits:   0-5   6-10   11-15   16-20       21-30   31

Reserved: bits 16-20.

cond_ok ← BO[0] | (CR[BI] ≡ BO[1])
if cond_ok
    then
        NIA ←iea CTR[0-29] || 0b00
        if LK
            then LR ←iea CIA + 4

The BI field specifies the bit in the condition register to be used as the condition of the
branch. The BO field is encoded as described in Table 8-7. Additional information about
BO field encoding is provided in Section 4.2.4.2, “Conditional Branch Control.”

Table 8-7. BO Operand Encodings

BO      Description
0000y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE.
0001y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE.
001zy   Branch if the condition is FALSE.
0100y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE.
0101y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE.
011zy   Branch if the condition is TRUE.
1z00y   Decrement the CTR, then branch if the decremented CTR ≠ 0.
1z01y   Decrement the CTR, then branch if the decremented CTR = 0.
1z1zz   Branch always.

In this table, z indicates a bit that is ignored.

Note: The z bits should be cleared, as they may be assigned a meaning in some future version of the
PowerPC architecture.

The y bit provides a hint about whether a conditional branch is likely to be taken, and may be used by some
PowerPC implementations to improve performance.

The branch target address is CTR[0-29] || 0b00.

If LK = 1, the effective address of the instruction following the branch instruction is placed
into the link register.
If the “decrement and test CTR” option is specified (BO[2] = 0), the instruction form is
invalid.

Other registers altered: 

Affected: Link Register (LR) (If LK =1) 

Simplified mnemonics: 

bltctr           equivalent to    bcctr    12,0
bnectr    cr2    equivalent to    bcctr    4,10

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL


bclrx

Branch Conditional to Link Register (x’4C00 0020’)

bclr     BO,BI    (LK = 0)
bclrl    BO,BI    (LK = 1)

Field:  19    BO     BI      0 0 0 0 0   16      LK
Bits:   0-5   6-10   11-15   16-20       21-30   31

Reserved: bits 16-20.

if ¬ BO[2]
    then CTR ← CTR - 1
ctr_ok ← BO[2] | ((CTR ≠ 0) ⊕ BO[3])
cond_ok ← BO[0] | (CR[BI] ≡ BO[1])
if ctr_ok & cond_ok
    then
        NIA ←iea LR[0-29] || 0b00
        if LK
            then LR ←iea CIA + 4

The BI field specifies the bit in the condition register to be used as the condition of the
branch. The BO field is encoded as described in Table 8-8. Additional information about
BO field encoding is provided in Section 4.2.4.2, “Conditional Branch Control.”

Table 8-8. BO Operand Encodings

BO      Description
0000y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE.
0001y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE.
001zy   Branch if the condition is FALSE.
0100y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE.
0101y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE.
011zy   Branch if the condition is TRUE.
1z00y   Decrement the CTR, then branch if the decremented CTR ≠ 0.
1z01y   Decrement the CTR, then branch if the decremented CTR = 0.
1z1zz   Branch always.

If the BO field specifies that the CTR is to be decremented, the entire 32-bit CTR is decremented.

In this table, z indicates a bit that is ignored.

Note: The z bits should be cleared, as they may be assigned a meaning in some future version of the
PowerPC architecture.

The y bit provides a hint about whether a conditional branch is likely to be taken, and may be used by
some PowerPC implementations to improve performance.

The branch target address is LR[0-29] || 0b00.
If LK = 1, then the effective address of the instruction following the branch instruction is
placed into the link register.

Other registers altered: 

Affected: Count Register (CTR) (If BO[2] = 0) 

Affected: Link Register (LR) (If LK =1) 

Simplified mnemonics: 


bltlr           equivalent to    bclr    12,0
bnelr    cr2    equivalent to    bclr    4,10
bdnzlr          equivalent to    bclr    16,0

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

cmp

Compare (x’7C00 0000’)

cmp    crfD,L,rA,rB

Field:  31    crfD   0   L    A       B       0000000000   0
Bits:   0-5   6-8    9   10   11-15   16-20   21-30        31

Reserved: bits 9 and 31.

a ← (rA)
b ← (rB)
if a < b
    then c ← 0b100
else if a > b
    then c ← 0b010
else c ← 0b001
CR[(4 * crfD)-(4 * crfD + 3)] ← c || XER[SO]

The contents of rA are compared with the contents of rB, treating the operands as signed
integers. The result of the comparison is placed into CR field crfD.

NOTE: If L = 1, the instruction form is invalid.

Other registers altered:

• Condition Register (CR field specified by operand crfD):

Affected: LT, GT, EQ, SO

Simplified mnemonics:

cmpd    rA,rB        equivalent to    cmp    0,1,rA,rB
cmpw    cr3,rA,rB    equivalent to    cmp    3,0,rA,rB

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X
cmpi

Compare Immediate (x’2C00 0000’)

cmpi    crfD,L,rA,SIMM

Field:  11    crfD   0   L    A       SIMM
Bits:   0-5   6-8    9   10   11-15   16-31

Reserved: bit 9.

a ← (rA)
if a < EXTS(SIMM)
    then c ← 0b100
else if a > EXTS(SIMM)
    then c ← 0b010
else c ← 0b001
CR[(4 * crfD)-(4 * crfD + 3)] ← c || XER[SO]

The contents of rA are compared with the sign-extended value of the SIMM field, treating
the operands as signed integers. The result of the comparison is placed into CR field crfD.

NOTE: If L = 1, the instruction form is invalid.

Other registers altered:

• Condition Register (CR field specified by operand crfD):

Affected: LT, GT, EQ, SO

Simplified mnemonics:

cmpdi    rA,value        equivalent to    cmpi    0,1,rA,value
cmpwi    cr3,rA,value    equivalent to    cmpi    3,0,rA,value













cmpl

Compare Logical (x’7C00 0040’)

cmpl    crfD,L,rA,rB

Field:  31    crfD   0   L    A       B       32      0
Bits:   0-5   6-8    9   10   11-15   16-20   21-30   31

Reserved: bits 9 and 31.

a ← (rA)
b ← (rB)
if a <U b
    then c ← 0b100
else if a >U b
    then c ← 0b010
else c ← 0b001
CR[(4 * crfD)-(4 * crfD + 3)] ← c || XER[SO]

The contents of rA are compared with the contents of rB, treating the operands as unsigned
integers. The result of the comparison is placed into CR field crfD.

NOTE: If L = 1, the instruction form is invalid.

Other registers altered:

• Condition Register (CR field specified by operand crfD):

Affected: LT, GT, EQ, SO

Simplified mnemonics:

cmpld    rA,rB        equivalent to    cmpl    0,1,rA,rB
cmplw    cr3,rA,rB    equivalent to    cmpl    3,0,rA,rB

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X
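
The only difference between cmp and cmpl is whether the operands are interpreted as signed or unsigned. The following Python sketch models both and shows how the 4-bit result lands in CR field crfD. The function names compare and set_cr_field are illustrative, and the condition register is represented as a 32-bit integer with field 0 in the most-significant nibble.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit word as a signed two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def compare(a, b, signed, xer_so=0):
    """Return the 4-bit field c || XER[SO] for cmp (signed) or cmpl (unsigned)."""
    if signed:
        a, b = to_signed(a), to_signed(b)
    c = 0b100 if a < b else 0b010 if a > b else 0b001
    return (c << 1) | xer_so

def set_cr_field(cr, crfd, field):
    """Write a 4-bit field into CR bits 4*crfD through 4*crfD + 3."""
    shift = 4 * (7 - crfd)
    return (cr & ~(0xF << shift) & MASK32) | (field << shift)
```

The same operand pair can therefore compare as less than under cmp and greater than under cmpl, as with 0xFFFF_FFFF and 1.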


cmpli

Compare Logical Immediate (x’2800 0000’)

cmpli    crfD,L,rA,UIMM

Field:  10    crfD   0   L    A       UIMM
Bits:   0-5   6-8    9   10   11-15   16-31

Reserved: bit 9.

a ← (rA)
if a <U ((16)0 || UIMM)
    then c ← 0b100
else if a >U ((16)0 || UIMM)
    then c ← 0b010
else c ← 0b001
CR[(4 * crfD)-(4 * crfD + 3)] ← c || XER[SO]

The contents of rA are compared with 0x0000 || UIMM, treating the operands as unsigned
integers. The result of the comparison is placed into CR field crfD.

NOTE: If L = 1, the instruction form is invalid.

Other registers altered:

• Condition Register (CR field specified by operand crfD):

Affected: LT, GT, EQ, SO

Simplified mnemonics:

cmpldi    rA,value        equivalent to    cmpli    0,1,rA,value
cmplwi    cr3,rA,value    equivalent to    cmpli    3,0,rA,value

cntlzwx

Count Leading Zeros Word (x’7C00 0034’)

cntlzw     rA,rS    (Rc = 0)
cntlzw.    rA,rS    (Rc = 1)

Field:  31    S      A       0 0 0 0 0   26      Rc
Bits:   0-5   6-10   11-15   16-20       21-30   31

Reserved: bits 16-20.

n ← 0
do while n < 32
    if rS[n] = 1
        then leave
    n ← n + 1
rA ← n

A count of the number of consecutive zero bits starting at bit 0 of rS is placed into rA. This
number ranges from 0 to 32, inclusive.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (If Rc = 1)

NOTE: If Rc = 1, then LT is cleared in the CR0 field.

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X
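
The pseudocode loop for counting leading zeros translates almost line for line into executable form. The following Python sketch follows the same structure. The only assumption is the bit numbering, where rS[n] is bit n counted from the most-significant end of the 32-bit word.

```python
def cntlzw(rs):
    """Model of cntlzw: count consecutive zero bits starting at bit 0 of rS."""
    n = 0
    while n < 32:
        if (rs >> (31 - n)) & 1:    # rS[n] = 1, leave the loop
            break
        n += 1
    return n
```

The result is never negative, which is consistent with the note that LT is cleared in CR0 when Rc = 1.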


crand

Condition Register AND (x’4C00 0202’)

crand    crbD,crbA,crbB

Field:  19    crbD   crbA    crbB    257     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

Reserved: bit 31.

CR[crbD] ← CR[crbA] & CR[crbB]

The bit in the condition register specified by crbA is ANDed with the bit in the condition
register specified by crbB. The result is placed into the condition register bit specified by
crbD.

Other registers altered:

• Condition Register:

Affected: Bit specified by operand crbD

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL


crandc

Condition Register AND with Complement (x’4C00 0102’)

crandc    crbD,crbA,crbB

Field:  19    crbD   crbA    crbB    129     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

Reserved: bit 31.

CR[crbD] ← CR[crbA] & ¬ CR[crbB]

The bit in the condition register specified by crbA is ANDed with the complement of the
bit in the condition register specified by crbB and the result is placed into the condition
register bit specified by crbD.

Other registers altered:

• Condition Register:

Affected: Bit specified by operand crbD

PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL
creqv

Condition Register Equivalent (x’4C00 0242’) 
creqv crbD,crbA,crbB 


□ Reserved 


19 

crbD 

crbA 

crbB 

289 

0 


0 5 6 10 11 15 16 20 21 30 31 


CR [ crbD ] CR [crbA] = CR[crbB] 

The bit in the condition register specified by crbA is XORed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD

Simplified mnemonics:

crset crbD    equivalent to    creqv crbD,crbD,crbD


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

crnand

Condition Register NAND (x’4C00 01C2’)

crnand crbD,crbA,crbB

Reserved fields are shown as 0.

Field:  19    crbD   crbA    crbB    225     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

CR[crbD] ← ¬ (CR[crbA] & CR[crbB])

The bit in the condition register specified by crbA is ANDed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

crnor

Condition Register NOR (x’4C00 0042’)

crnor crbD,crbA,crbB

Reserved fields are shown as 0.

Field:  19    crbD   crbA    crbB    33      0
Bits:   0-5   6-10   11-15   16-20   21-30   31

CR[crbD] ← ¬ (CR[crbA] | CR[crbB])

The bit in the condition register specified by crbA is ORed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD

Simplified mnemonics:

crnot crbD,crbA    equivalent to    crnor crbD,crbA,crbA


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

cror

Condition Register OR (x’4C00 0382’)

cror crbD,crbA,crbB

Reserved fields are shown as 0.

Field:  19    crbD   crbA    crbB    449     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

CR[crbD] ← CR[crbA] | CR[crbB]

The bit in the condition register specified by crbA is ORed with the bit in the condition
register specified by crbB. The result is placed into the condition register bit specified by
crbD.

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD

Simplified mnemonics:

crmove crbD,crbA    equivalent to    cror crbD,crbA,crbA


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

crorc

Condition Register OR with Complement (x’4C00 0342’)

crorc crbD,crbA,crbB

Reserved fields are shown as 0.

Field:  19    crbD   crbA    crbB    417     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

CR[crbD] ← CR[crbA] | ¬ CR[crbB]

The bit in the condition register specified by crbA is ORed with the complement of the
condition register bit specified by crbB and the result is placed into the condition register
bit specified by crbD.

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

crxor

Condition Register XOR (x’4C00 0182’)

crxor crbD,crbA,crbB

Reserved fields are shown as 0.

Field:  19    crbD   crbA    crbB    193     0
Bits:   0-5   6-10   11-15   16-20   21-30   31

CR[crbD] ← CR[crbA] ⊕ CR[crbB]

The bit in the condition register specified by crbA is XORed with the bit in the condition
register specified by crbB and the result is placed into the condition register bit specified
by crbD.

Other registers altered: 

• Condition Register: 

Affected: Bit specified by operand crbD

Simplified mnemonics:

crclr crbD    equivalent to    crxor crbD,crbD,crbD
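
The simplified mnemonics for the condition register logical instructions rest on simple Boolean identities, which can be checked exhaustively for a single bit. The Python sketch below is illustrative only; the one-bit helper functions are hypothetical stand-ins for the corresponding instructions.

```python
# One-bit models of the four operations behind the simplified mnemonics.
def eqv(a, b): return 1 - (a ^ b)   # creqv: complement of XOR
def nor(a, b): return 1 - (a | b)   # crnor: complement of OR
def or_(a, b): return a | b         # cror
def xor(a, b): return a ^ b         # crxor

for bit in (0, 1):
    assert eqv(bit, bit) == 1        # creqv with identical operands always sets the bit
    assert nor(bit, bit) == 1 - bit  # crnor crbD,crbA,crbA complements crbA
    assert or_(bit, bit) == bit      # cror crbD,crbA,crbA copies crbA
    assert xor(bit, bit) == 0        # crxor with identical operands always clears the bit
```

Each identity holds for both possible bit values, which is why the assembler can map the shorter forms onto existing opcodes.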


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XL

dcba

Data Cache Block Allocate (x’7C00 05EC’)

dcba rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       758     0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).
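
The (rA|0) notation means that an rA field of 0 contributes the value zero rather than the contents of r0. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, the register file is a plain list, and the 32-byte block size is an assumption, since the cache block size is implementation-specific.

```python
def effective_address(gpr: list, ra: int, rb: int) -> int:
    """EA = (rA|0) + (rB): an rA field of 0 contributes 0, not the contents of r0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFF   # 32-bit wraparound

def block_address(ea: int, block_size: int = 32) -> int:
    """First byte of the cache block containing ea (block_size is an assumption)."""
    return ea & ~(block_size - 1)

gpr = [0] * 32
gpr[0], gpr[3], gpr[4] = 0x1000, 0x2000, 0x47
print(hex(effective_address(gpr, 0, 4)))   # 0x47: the contents of r0 are ignored
print(hex(effective_address(gpr, 3, 4)))   # 0x2047
print(hex(block_address(0x2047)))          # 0x2040
```

The cache management instructions operate on the whole block containing the addressed byte, so any EA within the block selects the same block.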

The dcba instruction allocates the block in the data cache addressed by EA by marking it
valid without reading the contents of the block from memory; the data in the cache block
is considered to be undefined after this instruction completes. This instruction is a hint that
the program will probably soon store into a portion of the block, but that the contents of the
rest of the block are not meaningful to the program (eliminating the need to read the block
from main memory). It can improve performance in these code sequences.

The dcba instruction executes as follows: 

• If the cache block containing the byte addressed by EA is in the data cache, the 
contents of all bytes are made undefined but the cache block is still considered valid. 

NOTE: Programming errors can occur if the data in this cache block is 
subsequently read or used inadvertently. 

• If the cache block containing the byte addressed by EA is not in the data cache and 
the corresponding memory page or block is caching-allowed, the cache block is 
allocated (and made valid) in the data cache without fetching the block from main 
memory, and the value of all bytes is undefined. 

• If the addressed byte corresponds to a cache-inhibited page or block (that is, if the
I bit is set), this instruction is treated as a no-op.

• If the cache block containing the byte addressed by EA is in coherency-required
memory, and the cache block exists in the data cache(s) of any other processor(s), it
is kept coherent in those caches (that is, the processor performs the appropriate bus
transactions to enforce this).

This instruction is treated as a store to the addressed byte with respect to address translation
and memory protection, referenced and changed recording, and the ordering enforced by
eieio or by the combination of caching-inhibited and guarded attributes for a page (or
block). However, the DSI exception is not invoked for a translation or protection violation,
and the referenced and changed bits need not be updated when the page or block is
cache-inhibited (causing the instruction to be treated as a no-op).

NOTE: This instruction is optional in the PowerPC architecture.

Other registers altered: 

• None 

In the PowerPC OEA, the dcba instruction is additionally defined to clear all bytes of a 
newly established block to zero in the case that the block did not already exist in the cache. 

Additionally, as the dcba instruction may establish a block in the data cache without 
verifying that the associated physical address is valid, a delayed machine check exception 
is possible. See Chapter 6, “Exceptions,” for a discussion about this type of machine check 
exception. 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                             √                  X

dcbf

Data Cache Block Flush (x’7C00 00AC’)

dcbf rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       86      0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).

The action taken depends on the memory mode associated with the block containing the 
byte addressed by EA and on the state of that block. If the system is a multiprocessor 
implementation and the block is marked coherency-required, the processor will, if 
necessary, send an address-only broadcast to other processors. The broadcast of the dcbf 
instruction causes another processor to copy the block to memory, if it has dirty data, and 
then invalidate the block from the cache. The list below describes the action taken for the 
two states of the memory coherency attribute (M bit). 

• Coherency required (requires the use of address broadcast) 

— Unmodified block — Invalidates copies of the block in the data caches of all
processors.

— Modified block — Copies the block to memory and invalidates it. (Whichever
processor holds it, there should be only one modified copy of the block.)

— Absent block — If a modified copy of the block is in the data cache of another
processor, causes that processor to copy the block to memory and invalidate it in its
data cache. If unmodified copies are in the data caches of other processors,
causes those copies to be invalidated in those data caches.

• Coherency not required (no address broadcast required) 

— Unmodified block — Invalidates the block in the processor’s data cache. 

— Modified block — Copies the block to memory. Invalidates the block in the 
processor’s data cache. 

— Absent block — No action is taken. 
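
The three coherency-not-required cases above can be sketched with a toy single-processor model. This Python fragment is illustrative only and makes several assumptions: the cache is a dictionary mapping a block address to a (data, modified) pair, main memory is a dictionary of block contents, and the name dcbf_model is hypothetical.

```python
def dcbf_model(cache: dict, memory: dict, block: int) -> None:
    """Flush one block from a single processor's data cache (coherency not required)."""
    entry = cache.pop(block, None)   # removing the entry models invalidation
    if entry is None:
        return                       # absent block: no action is taken
    data, modified = entry
    if modified:
        memory[block] = data         # modified block: copy to memory before invalidating

cache = {0x40: (b"new", True), 0x80: (b"clean", False)}
memory = {0x40: b"old", 0x80: b"clean"}
dcbf_model(cache, memory, 0x40)      # modified: written back, then invalidated
dcbf_model(cache, memory, 0x80)      # unmodified: invalidated only
dcbf_model(cache, memory, 0xC0)      # absent: nothing happens
print(memory[0x40], sorted(cache))   # b'new' []
```

The coherency-required cases additionally involve an address-only broadcast to other processors, which this single-processor sketch does not model.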

The function of this instruction is independent of the write-through, write-back and 
caching-inhibited/allowed modes of the block containing the byte addressed by EA. This 
instruction is treated as a load from the addressed byte with respect to address translation 
and memory protection. It is also treated as a load for referenced and changed bit recording 
except that referenced and changed bit recording may not occur. 

Other registers altered: None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X

dcbi

Data Cache Block Invalidate (x’7C00 03AC’)

dcbi rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       470     0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).


The action taken is dependent on the memory mode associated with the block containing 
the byte addressed by EA and on the state of that block. The list below describes the action 
taken if the block containing the byte addressed by EA is or is not in the cache. 

• Coherency required (requires the use of address broadcast) 

— Unmodified block — Invalidates copies of the block in the data caches of all
processors.

— Modified block — Invalidates the copy of the block in the data cache in the
processor(s) where it is found. (Discards any modified contents.)

— Absent block — If a modified copy of the block is in the data cache of another
processor, causes that processor to invalidate it in its data cache. If unmodified
copies are in the data caches of other processors, causes those copies to be
invalidated in those data caches.

• Coherency not required (no address broadcast required) 

— Unmodified block — Invalidates the block in the processor’s data cache. 

— Modified block — Invalidates the block in the processor’s data cache. (Discards 
any modified contents) 

— Absent block — No action is taken. 

When data address translation is enabled, MSR[DR] = 1, and the virtual address has no 
translation, a DSI exception occurs. 

The function of this instruction is independent of the write-through and caching- 
inhibited/allowed modes of the block containing the byte addressed by EA. This instruction 
operates as a store to the addressed byte with respect to address translation and protection. 
The referenced and changed bits are modified appropriately. 

This is a supervisor-level instruction. 

Other registers altered: None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                          yes                                   X

dcbst

Data Cache Block Store (x’7C00 006C’)

dcbst rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       54      0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).

The dcbst instruction executes as follows: 


• Coherency required (requires the use of address broadcast) 

— Unmodified block — No action in this processor. Signals other processors to copy 
to memory any modified cache block. 

— Modified block — The cache block is written to memory. (Only one processor
should have a copy of a modified block.)

— Absent block — No action in this processor. If a modified copy of the block is in 
the data cache of another processor, the cache line is written to memory. 

• Coherency not required (no address broadcast required) 

— Unmodified block — No action is taken. 

— Modified block — The cache block is written to memory. 

— Absent block — No action is taken. 

NOTE: For modified cache blocks written to memory, the architecture does not
stipulate whether or not to clear the modified state of the cache block. It is
left up to the processor designer to determine the final state of the cache
block. Either modified or valid is logically correct.

The function of this instruction is independent of the write-through and caching- 
inhibited/allowed modes of the block containing the byte addressed by EA. 

The processor treats this instruction as a load from the addressed byte with respect to 
address translation and memory protection. It is also treated as a load for referenced and 
changed bit recording except that referenced and changed bit recording may not occur. 

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X



dcbt

Data Cache Block Touch (x’7C00 022C’)

dcbt rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       278     0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).


This instruction is a hint that performance will possibly be improved if the block containing
the byte addressed by EA is fetched into the data cache, because the program will probably
soon load from the addressed byte. If the block is caching-inhibited, the hint is ignored and
the instruction is treated as a no-op. Executing dcbt does not cause the system alignment
error handler to be invoked.

This instruction is treated as a load from the addressed byte with respect to address 
translation, memory protection, and reference and change recording except that referenced 
and changed bit recording may not occur. Additionally, no exception occurs in the case of 
a translation fault or protection violation. 

The program uses the dcbt instruction to request a cache block fetch before it is actually
needed by the program. The program can later execute load instructions to put data into
registers. However, the processor is not obliged to load the addressed block into the data
cache.

NOTE: This instruction is defined architecturally to perform the same functions as the
dcbtst instruction. Both are defined in order to allow implementations to
differentiate the bus actions when fetching into the cache for the case of a load
and for a store.

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X

dcbtst

Data Cache Block Touch for Store (x’7C00 01EC’)

dcbtst rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       246     0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).

This instruction is a hint that performance will possibly be improved if the block containing
the byte addressed by EA is fetched into the data cache, because the program will probably
soon store to the addressed byte. If the block is caching-inhibited, the hint is ignored and
the instruction is treated as a no-op. Executing dcbtst does not cause the system alignment
error handler to be invoked.

This instruction is treated as a load from the addressed byte with respect to address 
translation, memory protection, and reference and change recording except that referenced 
and changed bit recording may not occur. Additionally, no exception occurs in the case of 
a translation fault or protection violation. 

The program uses dcbtst to request a cache block fetch to potentially improve performance 
for a subsequent store to that EA, as that store would then be to a cached location. However, 
the processor is not obliged to load the addressed block into the data cache. 

NOTE: This instruction is defined architecturally to perform the same functions as the
dcbt instruction. Both are defined in order to allow implementations to
differentiate the bus actions when fetching into the cache for the case of a load
and for a store.

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X

dcbz

Data Cache Block Clear to Zero (x’7C00 07EC’)

dcbz rA,rB

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   A       B       1014    0
Bits:   0-5   6-10        11-15   16-20   21-30   31

EA is the sum (rA|0) + (rB).

This instruction is treated as a store to the addressed byte with respect to address
translation, memory protection, and referenced and changed recording. It is also treated as a
store with respect to the ordering enforced by eieio and the ordering enforced by the
combination of caching-inhibited and guarded attributes for a page (or block).

The dcbz instruction executes as follows: 

• If the cache block containing the byte addressed by EA is in the data cache, all bytes
are cleared and the cache line is marked modified (“M”).

• If the cache block containing the byte addressed by EA is not in the data cache and 
the corresponding memory page or block is caching-allowed, the cache block is 
allocated (and made valid) in the data cache without fetching the block from main 
memory, and all bytes are cleared. 

• If the page containing the byte addressed by EA is in caching-inhibited or write- 
through mode, either all bytes of main memory that correspond to the addressed 
cache block are cleared or the alignment exception handler is invoked. The 
exception handler can then clear all bytes in main memory that correspond to the 
addressed cache block. 

• If the cache block containing the byte addressed by EA is in coherency-required 
mode, and the cache block exists in the data cache(s) of any other processor(s), it is 
kept coherent in those caches (i.e. the processor performs the appropriate bus 
transactions to enforce this). 

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X

divwx

Divide Word (x’7C00 03D6’)

divw     rD,rA,rB    (OE = 0 Rc = 0)
divw.    rD,rA,rB    (OE = 0 Rc = 1)
divwo    rD,rA,rB    (OE = 1 Rc = 0)
divwo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:  31    D      A       B       OE   491     Rc
Bits:   0-5   6-10   11-15   16-20   21   22-30   31

dividend ← (rA)
divisor ← (rB)
rD ← dividend ÷ divisor

The dividend is the contents of rA. The divisor is the contents of rB. The remainder is not
supplied as a result. Both the operands and the quotient are interpreted as signed integers.
The quotient is the unique signed integer that satisfies the equation
dividend = (quotient * divisor) + r, where 0 ≤ r < |divisor| (if the dividend is
non-negative), and -|divisor| < r ≤ 0 (if the dividend is negative).

If an attempt is made to perform either of the divisions 0x8000_0000 ÷ -1 or
<anything> ÷ 0, then the contents of rD are undefined, as are the contents of the LT, GT,
and EQ bits of the CR0 field (if Rc = 1). In this case, if OE = 1 then OV is set.

The 32-bit signed remainder of dividing the contents of rA by the contents of rB can be
computed as follows, except in the case that the contents of rA = -2^31 and the contents of
rB = -1.


divw    rD,rA,rB    # rD = quotient
mullw   rD,rD,rB    # rD = quotient * divisor
subf    rD,rD,rA    # rD = remainder
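
The quotient definition and the remainder sequence can be checked with a small model. The following Python sketch is illustrative only; the function names are hypothetical, and None stands in for the cases in which the contents of rD are undefined.

```python
def to_signed32(x: int) -> int:
    """Interpret the low-order 32 bits of x as a signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def divw_model(ra: int, rb: int):
    """Return the signed quotient, or None where rD is undefined."""
    dividend, divisor = to_signed32(ra), to_signed32(rb)
    if divisor == 0 or (dividend == -2**31 and divisor == -1):
        return None
    quotient = abs(dividend) // abs(divisor)   # magnitude, truncated toward zero
    return -quotient if (dividend < 0) != (divisor < 0) else quotient

def remainder_model(ra: int, rb: int):
    """Mirror the divw / mullw / subf sequence."""
    q = divw_model(ra, rb)
    if q is None:
        return None
    product = (q * to_signed32(rb)) & 0xFFFFFFFF   # mullw keeps the low-order 32 bits
    return to_signed32(to_signed32(ra) - product)  # subf rD,rD,rA computes (rA) - (rD)

print(divw_model(-7, 2), remainder_model(-7, 2))   # -3 -1
print(divw_model(7, -2), remainder_model(7, -2))   # -3 1
```

In both examples the remainder takes the sign of the dividend, which matches the two ranges given for r above.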

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)

• XER:

Affected: SO, OV (if OE = 1)

NOTE: For more information on condition codes see Section 2.1.3, “Condition
Register,” and Section 2.1.5, “XER Register.”


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO

divwux

Divide Word Unsigned (x’7C00 0396’)

divwu     rD,rA,rB    (OE = 0 Rc = 0)
divwu.    rD,rA,rB    (OE = 0 Rc = 1)
divwuo    rD,rA,rB    (OE = 1 Rc = 0)
divwuo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:  31    D      A       B       OE   459     Rc
Bits:   0-5   6-10   11-15   16-20   21   22-30   31

dividend ← (rA)
divisor ← (rB)
rD ← dividend ÷ divisor

The dividend is the contents of rA. The divisor is the contents of rB. The remainder is not 
supplied as a result. 

Both operands and the quotient are interpreted as unsigned integers, except that if Rc = 1
the first three bits of the CR0 field are set by signed comparison of the result to zero. The
quotient is the unique unsigned integer that satisfies the equation
dividend = (quotient * divisor) + r (where 0 ≤ r < divisor). If an attempt is made to
perform the division <anything> ÷ 0, then the contents of rD are undefined, as are the
contents of the LT, GT, and EQ bits of the CR0 field (if Rc = 1). In this case, if OE = 1
then OV is set.

The 32-bit unsigned remainder of dividing the contents of rA by the contents of rB can be 
computed as follows: 

divwu rD,rA,rB # rD = quotient 

mullw rD,rD,rB # rD = quotient * divisor 

subf rD,rD,rA # rD = remainder 
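
The unsigned interpretation of the operands, together with the signed comparison used for CR0 when Rc = 1, can be illustrated with a short model. This Python sketch is an approximation for illustration; the function names are hypothetical, and None stands in for an undefined result.

```python
def divwu_model(ra: int, rb: int):
    """Return the unsigned quotient, or None where rD is undefined (division by zero)."""
    dividend, divisor = ra & 0xFFFFFFFF, rb & 0xFFFFFFFF
    if divisor == 0:
        return None
    return dividend // divisor

def cr0_lt_gt_eq(result: int):
    """LT, GT, EQ as set when Rc = 1: a signed comparison of the 32-bit result with zero."""
    result &= 0xFFFFFFFF
    signed = result - 0x100000000 if result & 0x80000000 else result
    return (int(signed < 0), int(signed > 0), int(signed == 0))

q = divwu_model(0xFFFFFFFE, 1)
print(hex(q), cr0_lt_gt_eq(q))   # 0xfffffffe (1, 0, 0)
```

The example shows a large unsigned quotient that nevertheless sets LT, because the CR0 bits are derived from a signed comparison.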

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 

• XER: 

Affected: SO, OV ( if OE = 1) 

NOTE: For more information on condition codes see Section 2.1.3, “Condition 
Register,” and Section 2.1.5, “XER Register.” 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               XO

eciwx

External Control In Word Indexed (x’7C00 026C’)

eciwx rD,rA,rB

Reserved fields are shown as 0.

Field:  31    D      A       B       310     0
Bits:   0-5   6-10   11-15   16-20   21-30   31


The eciwx instruction and the EAR register can be very efficient when mapping special 
devices such as graphics devices that use addresses as pointers. 

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
paddr ← address translation of EA
send load word request for paddr to device identified by EAR[RID]
rD ← word from device

EA is the sum (rA|0) + (rB).

A load word request for the physical address (referred to as real address in the architecture 
specification) corresponding to EA is sent to the device identified by EAR[RID], bypassing 
the cache. The word returned by the device is placed in rD. 

EAR[E] must be 1. If it is not, a DSI exception is generated.

EA must be a multiple of four. If it is not, one of the following occurs: 

• A system alignment exception is generated. 

• A DSI exception is generated (possible only if EAR[E] = 0). 

• The results are boundedly undefined. 

The eciwx instruction is supported for EAs that reference memory segments in which
SR[T] = 0 and for EAs mapped by the DBAT registers. If the EA references a direct-store
segment (SR[T] = 1), either a DSI exception occurs or the results are boundedly undefined.

NOTE: The direct-store facility is being phased out of the architecture and will not likely 
be supported in future devices. Thus, software should not depend on its effects. 

If this instruction is executed when MSR[DR] = 0 (real addressing mode), the results are 
boundedly undefined. 

This instruction is treated as a load from the addressed byte with respect to address 
translation, memory protection, referenced and changed bit recording, and the ordering 
performed by eieio. 

NOTE: This instruction is optional in the PowerPC architecture.

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                             √                  X

ecowx

External Control Out Word Indexed (x’7C00 036C’)

ecowx rS,rA,rB

Reserved fields are shown as 0.

Field:  31    S      A       B       438     0
Bits:   0-5   6-10   11-15   16-20   21-30   31


The ecowx instruction and the EAR register can be very efficient when mapping special 
devices such as graphics devices that use addresses as pointers. 

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
paddr ← address translation of EA
send store word request for paddr to device identified by EAR[RID]
send rS to device

EA is the sum (rA|0) + (rB).

A store word request for the physical address corresponding to EA and the contents of rS 
are sent to the device identified by EAR[RID], bypassing the cache. 

EAR[E] must be 1. If it is not, a DSI exception is generated.

EA must be a multiple of four. If it is not, one of the following occurs: 

• A system alignment exception is generated. 

• A DSI exception is generated (possible only if EAR[E] = 0). 

• The results are boundedly undefined. 

The ecowx instruction is supported for effective addresses that reference memory segments 
in which SR[T] = 0, and for EAs mapped by the DBAT registers. If the EA references a 
direct-store segment (SR[T] = 1), either a DSI exception occurs or the results are boundedly 
undefined. 

NOTE: The direct-store facility is being phased out of the architecture and will not likely 
be supported in future devices. Thus, software should not depend on its effects. 

If this instruction is executed when MSR[DR] = 0 (real addressing mode), the results are 
boundedly undefined. 

This instruction is treated as a store to the addressed byte with respect to address
translation, memory protection, referenced and changed bit recording, and the ordering
performed by eieio.

NOTE: Software synchronization is required in order to ensure that the data access is
performed in program order with respect to data accesses caused by other store
or ecowx instructions, even though the addressed byte is assumed to be
caching-inhibited and guarded.

NOTE: This instruction is optional in the PowerPC architecture. 

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                             √                  X

eieio

Enforce In-Order Execution of I/O (x’7C00 06AC’)

eieio

Reserved fields are shown as 0.

Field:  31    0 0 0 0 0   0 0 0 0 0   0 0 0 0 0   854     0
Bits:   0-5   6-10        11-15       16-20       21-30   31


The eieio instruction provides an ordering function for the effects of load and store 
instructions executed by a processor. These loads and stores are divided into two sets, which 
are ordered separately. The memory accesses caused by a dcbz or a dcba instruction are 
ordered like a store. The two sets follow: 

1. Loads and stores to memory that is both caching-inhibited and guarded, and stores
to memory that is write-through required. 

The eieio instruction controls the order in which the accesses are performed in main 
memory. It ensures that all applicable memory accesses caused by instructions 
preceding the eieio instruction have completed with respect to main memory before 
any applicable memory accesses caused by instructions following the eieio 
instruction access main memory. It acts like a barrier that flows through the memory 
queues and to main memory, preventing the reordering of memory accesses across 
the barrier. No ordering is performed for dcbz if the instruction causes the system 
alignment error handler to be invoked. 

All accesses in this set are ordered as a single set — that is, there is not one order for 
loads and stores to caching-inhibited and guarded memory and another order for 
stores to write-through required memory. 

2. Stores to memory that have all of the following attributes — caching-allowed, write- 
through not required, and memory-coherency required. 

The eieio instruction controls the order in which the accesses are performed with 
respect to coherent memory. It ensures that all applicable stores caused by 
instructions preceding the eieio instruction have completed with respect to coherent 
memory before any applicable stores caused by instructions following the eieio 
instruction complete with respect to coherent memory. 

With the exception of dcbz and dcba, eieio does not affect the order of cache operations 
(whether caused explicitly by execution of a cache management instruction, or implicitly 
by the cache coherency mechanism). For more information, refer to Chapter 5, “Cache 
Model and Memory Coherency.” The eieio instruction does not affect the order of accesses 
in one set with respect to accesses in the other set. 

The eieio instruction may complete before memory accesses caused by instructions 
preceding the eieio instruction have been performed with respect to main memory or 
coherent memory as appropriate. 

The eieio instruction is intended for use in managing shared data structures, in accessing
memory-mapped I/O, and in preventing load/store combining operations in main memory. 
For the first use, the shared data structure and the lock that protects it must be altered only 
by stores that are in the same set (1 or 2; see previous discussion). For the second use, eieio 
can be thought of as placing a barrier into the stream of memory accesses issued by a 
processor, such that any given memory access appears to be on the same side of the barrier 
to both the processor and the I/O device. 

Because the processor performs store operations in order to memory that is designated as 
both caching-inhibited and guarded (refer to Section 5.1.1, “Memory Access Ordering”), 
the eieio instruction is needed for such memory only when loads must be ordered with 
respect to stores or with respect to other loads. 

NOTE: The architecture attaches no hardware requirements to the eieio instruction;
for example, it does not require multiprocessor implementations to send an
eieio address-only broadcast, although such a broadcast is useful in some designs.

For example, if a design has an external buffer that re-orders loads and stores for 
better bus efficiency, the eieio broadcast signals to that buffer that previous 
loads/stores (marked caching-inhibited, guarded, or write-through required) 
must complete before any following loads/stores (marked caching-inhibited, 
guarded, or write-through required). 

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
VEA                                                                X


eqvx

Equivalent (x’7C00 0238’)

eqv     rA,rS,rB    (Rc = 0)
eqv.    rA,rS,rB    (Rc = 1)

Field:  31    S      A       B       284     Rc
Bits:   0-5   6-10   11-15   16-20   21-30   31

rA ← (rS) ≡ (rB)

The contents of rS are XORed with the contents of rB and the complemented result is 
placed into rA. 
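
The operation amounts to a bitwise XNOR of the two source registers. A minimal Python sketch follows; the function name eqv_model is hypothetical, and the mask keeps the result within 32 bits.

```python
def eqv_model(rs: int, rb: int) -> int:
    """rA <- complement of (rS XOR rB), as a 32-bit value."""
    return ~(rs ^ rb) & 0xFFFFFFFF

print(hex(eqv_model(0xF0F0F0F0, 0xFF00FF00)))   # 0xf00ff00f
print(hex(eqv_model(0x12345678, 0x12345678)))   # 0xffffffff: equal inputs set every bit
```

Each result bit is 1 exactly where the corresponding bits of rS and rB are equal.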

Other registers altered: 

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X

extsbx

Extend Sign Byte (x’7C00 0774’)

extsb    rA,rS    (Rc = 0)
extsb.   rA,rS    (Rc = 1)

Reserved fields are shown as 0.

Field:  31    S      A       0 0 0 0 0   954     Rc
Bits:   0-5   6-10   11-15   16-20       21-30   31

S ← rS[24]
rA[24-31] ← rS[24-31]
rA[0-23] ← (24)S

The contents of the low-order eight bits of rS are placed into the low-order eight bits of rA.
Bit 24 of rS is placed into the remaining bits of rA.
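
The sign extension can be modeled directly from this description. The following Python sketch is illustrative; the function name extsb_model is hypothetical, and registers are treated as 32-bit unsigned integers.

```python
def extsb_model(rs: int) -> int:
    """Sign-extend the low-order byte of rS to 32 bits."""
    low = rs & 0xFF                  # rS[24-31], the low-order eight bits
    sign = (low >> 7) & 1            # rS[24], the sign bit of the byte
    return low | (0xFFFFFF00 if sign else 0)

print(hex(extsb_model(0x12345680)))  # 0xffffff80
print(hex(extsb_model(0x1234567F)))  # 0x7f
```

The upper 24 bits of the source are discarded; only bit 24 determines how the high-order bits of rA are filled.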

Other registers altered: 

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X

extshx

Extend Sign Half Word (x’7C00 0734’)

extsh    rA,rS    (Rc = 0)
extsh.   rA,rS    (Rc = 1)

Reserved fields are shown as 0.

Field:  31    S      A       0 0 0 0 0   922     Rc
Bits:   0-5   6-10   11-15   16-20       21-30   31

S ← rS[16]
rA[16-31] ← rS[16-31]
rA[0-15] ← (16)S

The contents of the low-order 16 bits of rS are placed into the low-order 16 bits of rA. Bit 
16 of rS is placed into the remaining bits of rA. 
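
A short model of the half-word case is given below as a Python sketch. The function name extsh_model is hypothetical, and registers are treated as 32-bit unsigned integers.

```python
def extsh_model(rs: int) -> int:
    """Sign-extend the low-order half word of rS to 32 bits."""
    low = rs & 0xFFFF                # rS[16-31], the low-order 16 bits
    sign = (low >> 15) & 1           # rS[16], the sign bit of the half word
    return low | (0xFFFF0000 if sign else 0)

print(hex(extsh_model(0x12348000)))  # 0xffff8000
print(hex(extsh_model(0xABCD7FFF)))  # 0x7fff
```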

Other registers altered: 

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form
UISA                                                               X

fabsx

Floating Absolute Value (x’FC00 0210’)

fabs    frD,frB    (Rc = 0)
fabs.   frD,frB    (Rc = 1)

Reserved fields are shown as 0.

Field:  63    D      0 0 0 0 0   B       264     Rc
Bits:   0-5   6-10   11-15       16-20   21-30   31

The contents of frB with bit 0 cleared are placed into frD. 

NOTE: The fabs instruction treats NaNs just like any other kind of value. That is, the 
sign bit of a NaN may be altered by fabs. 

This instruction does not alter the FPSCR. 
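A minimal C sketch of the sign-bit manipulation follows, assuming that `double` is a 64-bit IEEE 754 value. The name `fabs_model` is hypothetical, and the CR1 update for Rc = 1 is not modelled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of fabs on a 64-bit IEEE 754 double.
   PowerPC bit 0 is the most-significant bit, i.e. the sign bit. */
double fabs_model(double frb)
{
    uint64_t bits;
    memcpy(&bits, &frb, sizeof bits);   /* reinterpret without aliasing issues */
    bits &= ~(UINT64_C(1) << 63);       /* clear bit 0 (the sign) */
    memcpy(&frb, &bits, sizeof bits);
    return frb;
}
```

Because only the sign bit is touched, the same code applies unchanged to NaNs, infinities, and zeros, which mirrors the note above.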

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 
faddx

Floating Add (Double-Precision) (x’FC00 002A’)

fadd     frD,frA,frB    (Rc = 0)
fadd.    frD,frA,frB    (Rc = 1)

Field | 63  | D    | A     | B     | 0 0 0 0 0 (reserved) | 21    | Rc
Bits  | 0-5 | 6-10 | 11-15 | 16-20 | 21-25                | 26-30 | 31


The following operation is performed: 

frD ← (frA) + (frB)

The floating-point operand in frA is added to the floating-point operand in frB. If the most- 
significant bit of the resultant significand is not a one, the result is normalized. The result 
is rounded to double-precision under control of the floating-point rounding control field RN 
of the FPSCR and placed into frD. 

Floating-point addition is based on exponent comparison and addition of the two 
significands. The exponents of the two operands are compared, and the significand 
accompanying the smaller exponent is shifted right, with its exponent increased by one for 
each bit shifted, until the two exponents are equal. The two significands are then added or 
subtracted as appropriate, depending on the signs of the operands. All 53 bits in the 
significand as well as all three guard bits (G, R, and X) enter into the computation. 

If a carry occurs, the sum's significand is shifted right one bit position and the exponent is 
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for invalid 
operation exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
faddsx

Floating Add Single (x’EC00 002A’)

fadds    frD,frA,frB    (Rc = 0)
fadds.   frD,frA,frB    (Rc = 1)

Field | 59  | D    | A     | B     | 0 0 0 0 0 (reserved) | 21    | Rc
Bits  | 0-5 | 6-10 | 11-15 | 16-20 | 21-25                | 26-30 | 31


The following operation is performed: 

frD ← (frA) + (frB)

The floating-point operand in frA is added to the floating-point operand in frB. If the most- 
significant bit of the resultant significand is not a one, the result is normalized. The result 
is rounded to single-precision under control of the floating-point rounding control field
RN of the FPSCR and placed into frD. 

Floating-point addition is based on exponent comparison and addition of the two 
significands. The exponents of the two operands are compared, and the significand 
accompanying the smaller exponent is shifted right, with its exponent increased by one for 
each bit shifted, until the two exponents are equal. The two significands are then added or 
subtracted as appropriate, depending on the signs of the operands. All 53 bits in the 
significand as well as all three guard bits (G, R, and X) enter into the computation. 

If a carry occurs, the sum's significand is shifted right one bit position and the exponent is 
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for invalid 
operation exceptions when FPSCR[VE] = 1. 


Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
fcmpo

Floating Compare Ordered (x’FC00 0040’)

fcmpo    crfD,frA,frB

Field | 63  | crfD | 0 0 (reserved) | A     | B     | 32    | 0 (reserved)
Bits  | 0-5 | 6-8  | 9-10           | 11-15 | 16-20 | 21-30 | 31


if ((frA) is a NaN or (frB) is a NaN)
    then c ← 0b0001
else if (frA) < (frB)
    then c ← 0b1000
else if (frA) > (frB)
    then c ← 0b0100
else c ← 0b0010
FPCC ← c
CR[(4 * crfD)-(4 * crfD + 3)] ← c
if ((frA) is an SNaN or (frB) is an SNaN)
    then VXSNAN ← 1
        if VE = 0
            then VXVC ← 1
else if ((frA) is a QNaN or (frB) is a QNaN)
    then VXVC ← 1

The floating-point operand in frA is compared to the floating-point operand in frB. The 
result of the compare is placed into CR field crfD and the FPCC. 

If one of the operands is a NaN, either quiet or signaling, then CR field crfD and the FPCC 
are set to reflect unordered. If one of the operands is a signaling NaN, then VXSNAN is set, 
and if invalid operation is disabled (VE = 0) then VXVC is set. Otherwise, if one of the 
operands is a QNaN, then VXVC is set. 

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 

Affected: LT, GT, EQ, UN 

• Floating-Point Status and Control Register: 

Affected: FPCC, FX, VXSNAN, VXVC 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
fcmpu

Floating Compare Unordered (x’FC00 0000’)

fcmpu    crfD,frA,frB

Field | 63  | crfD | 0 0 (reserved) | A     | B     | 0000000000 | 0 (reserved)
Bits  | 0-5 | 6-8  | 9-10           | 11-15 | 16-20 | 21-30      | 31


if ((frA) is a NaN or (frB) is a NaN)
    then c ← 0b0001
else if (frA) < (frB)
    then c ← 0b1000
else if (frA) > (frB)
    then c ← 0b0100
else c ← 0b0010
FPCC ← c
CR[(4 * crfD)-(4 * crfD + 3)] ← c
if ((frA) is an SNaN or (frB) is an SNaN)
    then VXSNAN ← 1

The floating-point operand in register frA is compared to the floating-point operand in 
register frB. The result of the compare is placed into CR field crfD and the FPCC. 

If one of the operands is a NaN, either quiet or signaling, then CR field crfD and the FPCC 
are set to reflect unordered. If one of the operands is a signaling NaN, then VXSNAN is set. 

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 

Affected: LT, GT, EQ, UN 

• Floating-Point Status and Control Register: 

Affected: FPCC, FX, VXSNAN 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
fctiwx

Floating Convert to Integer Word (x’FC00 001C’)

fctiw    frD,frB    (Rc = 0)
fctiw.   frD,frB    (Rc = 1)

Field | 63  | D    | 0 0 0 0 0 (reserved) | B     | 14    | Rc
Bits  | 0-5 | 6-10 | 11-15                | 16-20 | 21-30 | 31

The floating-point operand in register frB is converted to a 32-bit signed integer, using the
rounding mode specified by FPSCR[RN], and placed in bits 32-63 of frD. Bits 0-31 of frD
are undefined.

If the operand in frB is greater than 2^31 - 1, bits 32-63 of frD are set to 0x7FFF_FFFF.

If the operand in frB is less than -2^31, bits 32-63 of frD are set to 0x8000_0000.

The conversion is described fully in Section D.4.2, “Floating-Point Convert to Integer 
Model.” 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined. 
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result 
is inexact. 

(Programmer's note: A stfiwx instruction should be used to store the 32-bit resultant integer
because bits 0-31 of frD are undefined. A store double-precision instruction, e.g., stfdx,
will store the 64-bit result, but 4 superfluous bytes are stored (bits frD[0-31]). This may
cause wasted bus traffic.)

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
fctiwzx

Floating Convert to Integer Word with Round toward Zero (x’FC00 001E’)

fctiwz   frD,frB    (Rc = 0)
fctiwz.  frD,frB    (Rc = 1)

The floating-point operand in register frB is converted to a 32-bit signed integer, using the
rounding mode round toward zero, and placed in bits 32-63 of frD. Bits 0-31 of frD are
undefined.

If the operand in frB is greater than 2^31 - 1, bits 32-63 of frD are set to 0x7FFF_FFFF.

If the operand in frB is less than -2^31, bits 32-63 of frD are set to 0x8000_0000.

The conversion is described fully in Section D.4.2, “Floating-Point Convert to Integer 
Model.” 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined. 
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result 
is inexact. 

(Programmer's note: A stfiwx instruction should be used to store the 32-bit resultant integer
because bits 0-31 of frD are undefined. A store double-precision instruction, e.g., stfdx,
will store the 64-bit result, but 4 superfluous bytes are stored (bits frD[0-31]). This may
cause wasted bus traffic.)
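The saturating, round-toward-zero conversion can be sketched in C as shown here. The function name `fctiwz_model` is hypothetical. The NaN case is an assumption on my part (the text above does not state the result for a NaN operand), and FPSCR updates are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the fctiwz result placed in frD[32-63]. */
int32_t fctiwz_model(double frb)
{
    if (frb != frb)                 /* NaN: assumed to give 0x8000_0000 */
        return INT32_MIN;
    if (frb > 2147483647.0)         /* greater than 2^31 - 1 */
        return INT32_MAX;           /* 0x7FFF_FFFF */
    if (frb < -2147483648.0)        /* less than -2^31 */
        return INT32_MIN;           /* 0x8000_0000 */
    return (int32_t)frb;            /* C conversion truncates toward zero */
}
```

The range checks are done before the cast because converting an out-of-range floating-point value to an integer type is undefined behavior in C, whereas the instruction saturates.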

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
fdivx

Floating Divide (Double-Precision) (x’FC00 0024’)

fdiv     frD,frA,frB    (Rc = 0)
fdiv.    frD,frA,frB    (Rc = 1)

The floating-point operand in register frA is divided by the floating-point operand in
register frB. The remainder is not supplied as a result.

If the most-significant bit of the resultant significand is not a one, the result is normalized.
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Floating-point division is based on exponent subtraction and division of the significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1 and zero divide exceptions when FPSCR[ZE] = 1. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, ZX, XX, VXSNAN, VXIDI, VXZDZ 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fdivsx

Floating Divide Single (x’EC00 0024’)

fdivs    frD,frA,frB    (Rc = 0)
fdivs.   frD,frA,frB    (Rc = 1)

The floating-point operand in register frA is divided by the floating-point operand in
register frB. The remainder is not supplied as a result.

If the most-significant bit of the resultant significand is not a one, the result is normalized.
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Floating-point division is based on exponent subtraction and division of the significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1 and zero divide exceptions when FPSCR[ZE] = 1. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, ZX, XX, VXSNAN, VXIDI, VXZDZ 


fmaddx

Floating Multiply-Add (Double-Precision) (x’FC00 003A’)

fmadd    frD,frA,frC,frB    (Rc = 0)
fmadd.   frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← [(frA) * (frC)] + (frB)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 

If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fmaddsx

Floating Multiply-Add Single (x’EC00 003A’)

fmadds   frD,frA,frC,frB    (Rc = 0)
fmadds.  frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← [(frA) * (frC)] + (frB)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 

If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fmrx

Floating Move Register (Double-Precision) (x’FC00 0090’)

fmr      frD,frB    (Rc = 0)
fmr.     frD,frB    (Rc = 1)

The following operation is performed:

frD ← (frB)

The content of register frB is placed into frD. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 
fmsubx

Floating Multiply-Subtract (Double-Precision) (x’FC00 0038’)

fmsub    frD,frA,frC,frB    (Rc = 0)
fmsub.   frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← [(frA) * (frC)] - (frB)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If the most-significant bit of the resultant significand is not a one, the result is normalized.
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fmsubsx

Floating Multiply-Subtract Single (x’EC00 0038’)

fmsubs   frD,frA,frC,frB    (Rc = 0)
fmsubs.  frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← [(frA) * (frC)] - (frB)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If the most-significant bit of the resultant significand is not a one, the result is normalized.
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fmulx

Floating Multiply (Double-Precision) (x’FC00 0032’)

fmul     frD,frA,frC    (Rc = 0)
fmul.    frD,frA,frC    (Rc = 1)

The following operation is performed:

frD ← (frA) * (frC)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. 

If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Floating-point multiplication is based on exponent addition and multiplication of the 
significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXIMZ 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fmulsx

Floating Multiply Single (x’EC00 0032’)

fmuls    frD,frA,frC    (Rc = 0)
fmuls.   frD,frA,frC    (Rc = 1)

The following operation is performed:

frD ← (frA) * (frC)

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. 

If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Floating-point multiplication is based on exponent addition and multiplication of the 
significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXIMZ 


fnabsx

Floating Negative Absolute Value (x’FC00 0110’)

fnabs    frD,frB    (Rc = 0)
fnabs.   frD,frB    (Rc = 1)

The following operation is performed:

frD ← 0b1 || frB[1-63]

The contents of register frB with bit 0 set are placed into frD. 

NOTE: The fnabs instruction treats NaNs just like any other kind of value. That is, the 
sign bit of a NaN may be altered by fnabs. This instruction does not alter the 
FPSCR. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1)
fnegx

Floating Negate (x’FC00 0050’)

fneg     frD,frB    (Rc = 0)
fneg.    frD,frB    (Rc = 1)

The following operation is performed:

frD ← ¬ frB[0] || frB[1-63]

The contents of register frB with bit 0 inverted are placed into frD. 

NOTE: The fneg instruction treats NaNs just like any other kind of value. That is, the 
sign bit of a NaN may be altered by fneg. This instruction does not alter the 
FPSCR. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 









fnmaddx

Floating Negative Multiply-Add (Double-Precision) (x’FC00 003E’)

fnmadd   frD,frA,frC,frB    (Rc = 0)
fnmadd.  frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← - ([(frA) * (frC)] + (frB))

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 
If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result as would be obtained by using the Floating
Multiply-Add (fmaddx) instruction and then negating the result, with the following
exceptions:

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception have 
a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 


fnmaddsx

Floating Negative Multiply-Add Single (x’EC00 003E’)

fnmadds  frD,frA,frC,frB    (Rc = 0)
fnmadds. frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← - ([(frA) * (frC)] + (frB))

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 
If the most-significant bit of the resultant significand is not a one, the result is normalized. 
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result as would be obtained by using the Floating
Multiply-Add Single (fmaddsx) instruction and then negating the result, with the following
exceptions:

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception have 
a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fnmsubx

Floating Negative Multiply-Subtract (Double-Precision) (x’FC00 003C’)

fnmsub   frD,frA,frC,frB    (Rc = 0)
fnmsub.  frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← - ([(frA) * (frC)] - (frB))

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If the most-significant bit of the resultant significand is not one, the result is normalized. 
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result obtained by negating the result of a Floating 
Multiply-Subtract (fmsubx) instruction with the following exceptions: 

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception have 
a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field) 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fnmsubsx

Floating Negative Multiply-Subtract Single (x’EC00 003C’)

fnmsubs  frD,frA,frC,frB    (Rc = 0)
fnmsubs. frD,frA,frC,frB    (Rc = 1)

The following operation is performed:

frD ← - ([(frA) * (frC)] - (frB))

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If the most-significant bit of the resultant significand is not one, the result is normalized. 
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result obtained by negating the result of a Floating 
Multiply-Subtract Single (fmsubsx) instruction with the following exceptions: 

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception have 
a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field) 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | A
fresx

Floating Reciprocal Estimate Single (x’EC00 0030’)

fres     frD,frB    (Rc = 0)
fres.    frD,frB    (Rc = 1)

Field | 59  | D    | 0 0 0 0 0 (reserved) | B     | 0 0 0 0 0 (reserved) | 24    | Rc
Bits  | 0-5 | 6-10 | 11-15                | 16-20 | 21-25                | 26-30 | 31

The following operation is performed: 


frD ← estimate[1/(frB)]

A single-precision estimate of the reciprocal of the floating-point operand in register frB is 
placed into register frD. The estimate placed into register frD is correct to a precision of 
one part in 4096 of the reciprocal of frB. That is, 


$$\mathrm{ABS}\left(\frac{\mathrm{estimate} - \frac{1}{x}}{\frac{1}{x}}\right) \le \frac{1}{4096}$$


where x is the initial value in frB. 
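The accuracy bound can be expressed as a small C check. This is a sketch for illustration only: the name `fres_estimate_ok` is an assumption, and x is assumed to be finite and nonzero so that the special cases tabulated for this instruction do not arise.

```c
/* Hypothetical check of the accuracy bound for a reciprocal estimate.
   Assumes x is finite and nonzero. Returns 1 if the bound holds. */
int fres_estimate_ok(double x, double estimate)
{
    double exact = 1.0 / x;
    double rel = (estimate - exact) / exact;   /* relative error */
    if (rel < 0.0)
        rel = -rel;                            /* ABS(...) */
    return rel <= 1.0 / 4096.0;
}
```

For x = 2.0, an estimate of 0.5001 satisfies the bound (relative error 0.0002), while 0.501 does not (relative error 0.002).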

NOTE: The value placed into register frD may vary between implementations, and 
between different executions on the same implementation. 

Operation with various special values of the operand is summarized below: 


Operand | Result  | Exception
-∞      | -0      | None
-0      | -∞*     | ZX
+0      | +∞*     | ZX
+∞      | +0      | None
SNaN    | QNaN**  | VXSNAN
QNaN    | QNaN    | None

Notes: *  No result if FPSCR[ZE] = 1
       ** No result if FPSCR[VE] = 1

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1 and zero divide exceptions when FPSCR[ZE] = 1. 
NOTE: The PowerPC architecture makes no provision for a double-precision version of
the fres instruction. This is because graphics applications are expected to need
only the single-precision version, and no other important performance-critical
applications are expected to require a double-precision version of the fres
instruction.

NOTE: This instruction is optional in the PowerPC architecture. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1)

• Floating-Point Status and Control Register: 

Affected: FPRF, FR (undefined), FI (undefined), FX, OX, UX, ZX, VXSNAN 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  | √                | A
frspx

Floating Round to Single (x’FC00 0018’)

frsp     frD,frB    (Rc = 0)
frsp.    frD,frB    (Rc = 1)

Field | 63  | D    | 0 0 0 0 0 (reserved) | B     | 12    | Rc
Bits  | 0-5 | 6-10 | 11-15                | 16-20 | 21-30 | 31

The following operation is performed:

frD ← Round_single(frB)

If it is already in single-precision range, the floating-point operand in register frB is placed 
into frD. Otherwise, the floating-point operand in register frB is rounded to single- 
precision using the rounding mode specified by FPSCR[RN] and placed into frD. 
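In C, a comparable effect can be obtained by converting through `float`. This is a minimal sketch, assuming IEEE 754 `float` and `double` types and the default round-to-nearest mode; the name `frsp_model` is hypothetical, and FPSCR updates are not modelled.

```c
/* Hypothetical model of frsp: round a double to single precision and
   return it widened back to double, as it would sit in an FPR.
   Assumes IEEE 754 float/double and the default round-to-nearest mode. */
double frsp_model(double frb)
{
    float single = (float)frb;   /* rounds to single precision */
    return (double)single;       /* widening is exact */
}
```

A value such as 0.5 is already representable in single precision and comes back unchanged, whereas 0.1 is not and comes back slightly altered.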

The rounding is described fully in Section D.4.1, “Floating-Point Round to Single- 
Precision Model.” 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
frsqrtex

Floating Reciprocal Square Root Estimate (x’FC00 0034’)

frsqrte  frD,frB    (Rc = 0)
frsqrte. frD,frB    (Rc = 1)

Field | 63  | D    | 0 0 0 0 0 (reserved) | B     | 0 0 0 0 0 (reserved) | 26    | Rc
Bits  | 0-5 | 6-10 | 11-15                | 16-20 | 21-25                | 26-30 | 31

A double-precision estimate of the reciprocal of the square root of the floating-point
operand in register frB is placed into register frD. The estimate placed into register frD is
correct to a precision of one part in 4096 of the reciprocal of the square root of frB. That is,

$$\mathrm{ABS}\left(\frac{\mathrm{estimate} - \frac{1}{\sqrt{x}}}{\frac{1}{\sqrt{x}}}\right) \le \frac{1}{4096}$$


where x is the initial value in frB. 

NOTE: The value placed into register frD may vary between implementations, and 
between different executions on the same implementation. 

Operation with various special values of the operand is summarized below: 


Operand | Result  | Exception
-∞      | QNaN**  | VXSQRT
<0      | QNaN**  | VXSQRT
-0      | -∞*     | ZX
+0      | +∞*     | ZX
+∞      | +0      | None
SNaN    | QNaN**  | VXSNAN
QNaN    | QNaN    | None

Notes: *  No result if FPSCR[ZE] = 1
       ** No result if FPSCR[VE] = 1

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1 and zero divide exceptions when FPSCR[ZE] = 1. 

NOTE: No single-precision version of the frsqrte instruction is provided; however, both 
frB and frD are representable in single-precision format. 

NOTE: This instruction is optional in the PowerPC architecture. 
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Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1)

• Floating-Point Status and Control Register: 

Affected: FPRF, FR (undefined), FI (undefined), FX, ZX, VXSNAN, VXSQRT 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form

UISA                                            √                  A
fselx

Floating Select (x’FC00 002E’)

fsel frD,frA,frC,frB (Rc = 0)

fsel. frD,frA,frC,frB (Rc = 1)




63 

D A 

B 

C 

23 
if (frA) ≥ 0.0
    then frD ← (frC)
    else frD ← (frB)

The floating-point operand in register frA is compared to the value zero. If the operand is 
greater than or equal to zero, register frD is set to the contents of register frC. If the operand 
is less than zero or is a NaN, register frD is set to the contents of register frB. The 
comparison ignores the sign of zero (that is, regards +0 as equal to -0). 
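
The selection rule can be modeled in a few lines. This is a minimal sketch that uses Python floats in place of floating-point registers; the function name fsel and its argument order (frA, frC, frB) simply mirror the operand order of the instruction.

```python
def fsel(fra: float, frc: float, frb: float) -> float:
    """Model of fsel: select frc if fra >= 0.0, otherwise frb."""
    # A NaN compares false against 0.0, so it selects frb.
    # -0.0 >= 0.0 is true, so the sign of zero is ignored.
    return frc if fra >= 0.0 else frb
```

For example, fsel(-0.0, 1.0, 2.0) selects 1.0, while fsel(float("nan"), 1.0, 2.0) selects 2.0.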

Care must be taken in using fsel if IEEE compatibility is required, or if the values being 
tested can be NaNs or infinities. 

For examples of uses of this instruction, see Section D.3, “Floating-Point Conversions,” 
and Section D.5, “Floating-Point Selection.” 

NOTE: This instruction is optional in the PowerPC architecture. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1)


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form

UISA                                            √                  A
fsqrtx

Floating Square Root (Double-Precision) (x’FC00 002C’)

fsqrt frD,frB (Rc = 0)

fsqrt. frD,frB (Rc = 1)

j | Reserved 


63 

D 

0 0000 

B 

000 00 

22 
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The following operation is performed: 

frD ← Square_root(frB)

The square root of the floating-point operand in register frB is placed into register frD. 

If the most-significant bit of the resultant significand is not one, the result is normalized. 
The result is rounded to double-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Operation with various special values of the operand is summarized below: 


Operand   Result   Exception

-∞        QNaN*    VXSQRT
<0        QNaN*    VXSQRT
-0        -0       None
+∞        +∞       None
SNaN      QNaN*    VXSNAN
QNaN      QNaN     None

Notes: * No result if FPSCR[VE] = 1



FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1. 

NOTE: This instruction is optional in the PowerPC architecture. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 


Affected: FPRF, FR, FI, FX, ZX, VXSNAN, VXSQRT


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form

UISA                                            √                  A
fsqrtsx

Floating Square Root (Single-Precision) (x’EC00 002C’)

fsqrts frD,frB (Rc = 0)

fsqrts. frD,frB (Rc = 1)

j | Reserved 


59 

D 

0 0000 

B 

000 00 

22 
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The following operation is performed: 

frD ← Square_root(frB)

The square root of the floating-point operand in register frB is placed into register frD. 

If the most-significant bit of the resultant significand is not one, the result is normalized. 
The result is rounded to single-precision under control of the floating-point rounding 
control field RN of the FPSCR and placed into frD. 

Operation with various special values of the operand is summarized below: 


Operand   Result   Exception

-∞        QNaN*    VXSQRT
<0        QNaN*    VXSQRT
-0        -0       None
+∞        +∞       None
SNaN      QNaN*    VXSNAN
QNaN      QNaN     None

Notes: * No result if FPSCR[VE] = 1



FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1. 

NOTE: This instruction is optional in the PowerPC architecture. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 


Affected: FPRF, FR, FI, FX, ZX, VXSNAN, VXSQRT 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form

UISA                                            √                  A
fsubx

Floating Subtract (Double-Precision) (x’FC00 0028’)

fsub frD,frA,frB (Rc = 0)

fsub. frD,frA,frB (Rc = 1)


□ Reserved 
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B 
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20 
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The following operation is performed: 


frD ← (frA) - (frB)

The floating-point operand in register frB is subtracted from the floating-point operand in
register frA. If the most-significant bit of the resultant significand is not a one, the result is
normalized. The result is rounded to double-precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 

The execution of the fsub instruction is identical to that of fadd, except that the contents of 
frB participate in the operation with its sign bit (bit 0) inverted. 
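
The relationship between fsub and fadd can be illustrated with a minimal sketch. It assumes Python floats (IEEE double precision, round-to-nearest only) stand in for the registers; the helper names invert_sign_bit and fsub are hypothetical, and FPSCR effects and other rounding modes are not modeled.

```python
import struct

def invert_sign_bit(x: float) -> float:
    """Flip bit 0 (the sign bit) of a double-precision value."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return struct.unpack(">d", struct.pack(">Q", bits ^ (1 << 63)))[0]

def fsub(fra: float, frb: float) -> float:
    """Model fsub as an add in which frB participates with its sign inverted."""
    return fra + invert_sign_bit(frb)
```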

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



A 


Chapter 8. Instruction Set 


8-91 











fsubsx

Floating Subtract Single (x’EC00 0028’)

fsubs frD,frA,frB (Rc = 0)

fsubs. frD,frA,frB (Rc = 1)



□ Reserved 
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B 
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The following operation is performed: 


frD ← (frA) - (frB)

The floating-point operand in register frB is subtracted from the floating-point operand in
register frA. If the most-significant bit of the resultant significand is not a one, the result is
normalized. The result is rounded to single-precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 

The execution of the fsubs instruction is identical to that of fadds, except that the contents 
of frB participate in the operation with its sign bit (bit 0) inverted. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE] = 1.

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (if Rc = 1)

• Floating-Point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



A 
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icbi 


icbi 

Instruction Cache Block Invalidate (x’7C00 07AC’) 
icbi rA,rB 


□ Reserved 


31 

0 0 0 0 0 

A 

B 

982 

0 


0 5 6 10 11 15 16 20 21 30 31 


EA is the sum (rA|0) + (rB).

If the block containing the byte addressed by EA is in coherency-required mode, and a 
block containing the byte addressed by EA is in the instruction cache of any processor, the 
block is made invalid in all such instruction caches, so that subsequent references cause the 
block to be refetched. 

If the block containing the byte addressed by EA is in coherency-not-required mode, and a 
block containing the byte addressed by EA is in the instruction cache of this processor, the 
block is made invalid in that instruction cache, so that subsequent references cause the 
block to be refetched. 

The function of this instruction is independent of the write-through, write-back, and 
caching-inhibited/allowed modes of the block containing the byte addressed by EA. 

This instruction is treated as a load from the addressed byte with respect to address 
translation and memory protection. It may also be treated as a load for referenced and 
changed bit recording except that referenced and changed bit recording may not occur. 
Implementations with a combined data and instruction cache treat the icbi instruction as a 
no-op, except that they may invalidate the target block in the instruction caches of other 
processors if the block is in coherency-required mode. 

The icbi instruction invalidates the block at EA (rA|0 + rB). If the processor is a
multiprocessor implementation (for example, the 601, 604, or 620) and the block is marked 
coherency-required, the processor will send an address-only broadcast to other processors 
causing those processors to invalidate the block from their instruction caches. 

For faster processing, many implementations will not compare the entire EA (rA|0 + rB)
with the tag in the instruction cache. Instead, they will use the bits in the EA to locate the 
set that the block is in, and invalidate all blocks in that set. 

Other registers altered: 

• None 


PowerPC Architecture Level   Supervisor Level   PowerPC Optional   Form

VEA                                                                X
isync

Instruction Synchronize (x’4C00 012C’)

isync


□ Reserved 


19 

0 0 0 0 0 

0 0000 

0 0 0 0 0 

150 

0 


0 5 6 10 11 15 16 20 21 30 31 


The isync instruction provides an ordering function for the effects of all instructions 
executed by a processor. Executing an isync instruction ensures that all instructions 
preceding the isync instruction have completed before the isync instruction completes, 
except that memory accesses caused by those instructions need not have been performed 
with respect to other processors and mechanisms. It also ensures that no subsequent 
instructions are initiated by the processor until after the isync instruction completes. 
Finally, it causes the processor to discard any prefetched instructions, with the effect that 
subsequent instructions will be fetched and executed in the context established by the 
instructions preceding the isync instruction. The isync instruction has no effect on the other 
processors or on their caches. 

This instruction is context synchronizing. 

Context synchronization is necessary after certain code sequences that perform complex 
operations within the processor. These code sequences are usually operating system tasks 
that involve memory management. For example, if an instruction A changes the memory 
translation rules in the memory management unit (MMU), the isync instruction should be 
executed so that the instructions following instruction A will be discarded from the pipeline 
and refetched according to the new translation rules. 

NOTE: All exceptions and rfi and sc instructions are also context synchronizing. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

VEA 



XL 
lbz

Load Byte and Zero (x’8800 0000’)

lbz rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  34    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
rD ← (24)0 || MEM(EA, 1)


EA is the sum (rA|0) + d. The byte in memory addressed by EA is loaded into the low-order
eight bits of rD. The remaining bits in rD are cleared. 
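
The effective-address calculation and zero extension can be modeled as follows. This is a minimal sketch, assuming a list of 32 integers stands in for the GPRs and a bytes object stands in for memory; the names exts16 and lbz are hypothetical, and address translation and exceptions are not modeled.

```python
def exts16(d: int) -> int:
    """Sign-extend the 16-bit displacement field d (the EXTS operation)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def lbz(gpr: list, mem: bytes, rd: int, d: int, ra: int) -> None:
    """Model of lbz rD,d(rA) on a 32-bit implementation."""
    base = 0 if ra == 0 else gpr[ra]        # (rA|0): register number 0 means the value 0
    ea = (base + exts16(d)) & 0xFFFFFFFF    # 32-bit effective address
    gpr[rd] = mem[ea]                       # one byte; the upper 24 bits of rD are zero
```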

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lbzu

Load Byte and Zero with Update (x’8C00 0000’)

lbzu rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  35    D      A       d

EA ← (rA) + EXTS(d)
rD ← (24)0 || MEM(EA, 1)
rA ← EA

EA is the sum (rA) + d. The byte in memory addressed by EA is loaded into the low-order 
eight bits of rD. The remaining bits in rD are cleared. 

EA is placed into rA. 

If rA = 0, or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lbzux

Load Byte and Zero with Update Indexed (x’7C00 00EE’)

lbzux rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       119     0 (reserved)

EA ← (rA) + (rB)
rD ← (24)0 || MEM(EA, 1)
rA ← EA


EA is the sum (rA) + (rB). The byte in memory addressed by EA is loaded into the low- 
order eight bits of rD. The remaining bits in rD are cleared. 

EA is placed into rA. 

If rA = 0, or rA = rD, the instruction form is invalid.

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 
lbzx

Load Byte and Zero Indexed (x’7C00 00AE’)

lbzx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       87      0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
rD ← (24)0 || MEM(EA, 1)


EA is the sum (rA|0) + (rB). The byte in memory addressed by EA is loaded into the
low-order eight bits of rD. The remaining bits in rD are cleared.

Other registers altered: 

• None


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lfd

Load Floating-Point Double (x’C800 0000’)

lfd frD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  50    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
frD ← MEM(EA, 8)


EA is the sum (rA|0) + d.

The double word in memory addressed by EA is placed into frD. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lfdu

Load Floating-Point Double with Update (x’CC00 0000’)

lfdu frD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  51    D      A       d

EA ← (rA) + EXTS(d)
frD ← MEM(EA, 8)
rA ← EA

EA is the sum (rA) + d. 

The double word in memory addressed by EA is placed into frD. 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lfdux

Load Floating-Point Double with Update Indexed (x’7C00 04EE’)

lfdux frD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       631     0 (reserved)

EA ← (rA) + (rB)
frD ← MEM(EA, 8)
rA ← EA


EA is the sum (rA) + (rB). 

The double word in memory addressed by EA is placed into frD. 
EA is placed into rA. 

If rA = 0, the instruction form is invalid.

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lfdx

Load Floating-Point Double Indexed (x’7C00 04AE’)

lfdx frD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       599     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
frD ← MEM(EA, 8)


EA is the sum (rA|0) + (rB).

The double word in memory addressed by EA is placed into frD. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lfs

Load Floating-Point Single (x’C000 0000’)

lfs frD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  48    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
frD ← DOUBLE(MEM(EA, 4))


EA is the sum (rA|0) + d.

The word in memory addressed by EA is interpreted as a floating-point single-precision 
operand. This word is converted to floating-point double-precision and placed into frD
(see Appendix D.6, “Floating-Point Load Instructions”).
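
The single-to-double conversion performed by the load can be illustrated with a minimal sketch. It assumes big-endian memory held in a bytes object and relies on the fact that Python floats are IEEE double precision, so unpacking a single-precision word already yields the widened value; the function name lfs_value is hypothetical.

```python
import struct

def lfs_value(mem: bytes, ea: int) -> float:
    """Return DOUBLE(MEM(EA, 4)): the single-precision word widened to double."""
    (value,) = struct.unpack_from(">f", mem, ea)
    return value
```

Every single-precision value is exactly representable in double precision, so the conversion is exact. A value such as 0.1, which is rounded when stored as a single, is therefore loaded as the double nearest to that stored single, not as the double nearest to 0.1.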

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lfsu

Load Floating-Point Single with Update (x’C400 0000’)

lfsu frD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  49    D      A       d

EA ← (rA) + EXTS(d)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA





EA is the sum (rA) + d. 

The word in memory addressed by EA is interpreted as a floating-point single-precision 
operand. This word is converted to floating-point double-precision and placed into frD
(see Appendix D.6, “Floating-Point Load Instructions”).

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 
lfsux

Load Floating-Point Single with Update Indexed (x’7C00 046E’)

lfsux frD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       567     0 (reserved)

EA ← (rA) + (rB)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA

EA is the sum (rA) + (rB).

The word in memory addressed by EA is interpreted as a floating-point single-precision 
operand. This word is converted to floating-point double-precision and placed into frD
(see Appendix D.6, “Floating-Point Load Instructions”).

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lfsx

Load Floating-Point Single Indexed (x’7C00 042E’)

lfsx frD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       535     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
frD ← DOUBLE(MEM(EA, 4))

EA is the sum (rA|0) + (rB).

The word in memory addressed by EA is interpreted as a floating-point single-precision 
operand. This word is converted to floating-point double-precision and placed into frD
(see Appendix D.6, “Floating-Point Load Instructions”).

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lha

Load Half Word Algebraic (x’A800 0000’)

lha rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  42    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
rD ← EXTS(MEM(EA, 2))


EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into the
low-order 16 bits of rD. The remaining bits in rD are filled with a copy of the most-significant
bit of the loaded half word. 
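
The difference between an algebraic (sign-extending) load and a zero-extending load can be shown with a minimal sketch. It assumes big-endian memory in a bytes object and returns the 32-bit register image as an unsigned integer; the function names lha_value and lhz_value are hypothetical.

```python
def lhz_value(mem: bytes, ea: int) -> int:
    """(16)0 || MEM(EA, 2): the half word with the upper 16 bits cleared."""
    return int.from_bytes(mem[ea:ea + 2], "big")

def lha_value(mem: bytes, ea: int) -> int:
    """EXTS(MEM(EA, 2)): the half word with its sign bit copied into the upper 16 bits."""
    half = int.from_bytes(mem[ea:ea + 2], "big")
    if half & 0x8000:          # most-significant bit of the loaded half word
        half |= 0xFFFF0000
    return half
```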

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lhau

Load Half Word Algebraic with Update (x’AC00 0000’)

lhau rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  43    D      A       d

EA ← (rA) + EXTS(d)
rD ← EXTS(MEM(EA, 2))
rA ← EA

EA is the sum (rA) + d. The half word in memory addressed by EA is loaded into the low- 
order 16 bits of rD. The remaining bits in rD are filled with a copy of the most-significant 
bit of the loaded half word. 

EA is placed into rA. 

If rA = 0 or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lhaux

Load Half Word Algebraic with Update Indexed (x’7C00 02EE’)

lhaux rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       375     0 (reserved)

EA ← (rA) + (rB)
rD ← EXTS(MEM(EA, 2))
rA ← EA

EA is the sum (rA) + (rB). The half word in memory addressed by EA is loaded into the 
low-order 16 bits of rD. The remaining bits in rD are filled with a copy of the most- 
significant bit of the loaded half word. 

EA is placed into rA. 

If rA = 0 or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lhax

Load Half Word Algebraic Indexed (x’7C00 02AE’)

lhax rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       343     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
rD ← EXTS(MEM(EA, 2))


EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into the
low-order 16 bits of rD. The remaining bits in rD are filled with a copy of the most- 
significant bit of the loaded half word. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lhbrx

Load Half Word Byte-Reverse Indexed (x’7C00 062C’)

lhbrx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       790     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
rD ← (16)0 || MEM(EA + 1, 1) || MEM(EA, 1)


EA is the sum (rA|0) + (rB). Bits 0-7 of the half word in memory addressed by EA are
loaded into the low-order eight bits of rD. Bits 8-15 of the half word in memory addressed 
by EA are loaded into the subsequent low-order eight bits of rD. The remaining bits in rD 
are cleared. 
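
The byte swap can be modeled directly from the operation above. This is a minimal sketch, assuming memory is a bytes object; the function name lhbrx_value is hypothetical.

```python
def lhbrx_value(mem: bytes, ea: int) -> int:
    """(16)0 || MEM(EA + 1, 1) || MEM(EA, 1): half word with its two bytes swapped."""
    return (mem[ea + 1] << 8) | mem[ea]
```

With the bytes 0x12, 0x34 at EA, an ordinary half-word load yields 0x1234, whereas this model yields 0x3412.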

The PowerPC architecture cautions programmers that some implementations of the 
architecture may run the lhbrx instruction with greater latency than other types of load
instructions. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lhz

Load Half Word and Zero (x’A000 0000’)

lhz rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  40    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
rD ← (16)0 || MEM(EA, 2)


EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into the
low-order 16 bits of rD. The remaining bits in rD are cleared.

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lhzu

Load Half Word and Zero with Update (x’A400 0000’)

lhzu rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  41    D      A       d

EA ← (rA) + EXTS(d)
rD ← (16)0 || MEM(EA, 2)
rA ← EA


EA is the sum (rA) + d. The half word in memory addressed by EA is loaded into the low- 
order 16 bits of rD. The remaining bits in rD are cleared. 

EA is placed into rA. 

If rA = 0 or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 













lhzux

Load Half Word and Zero with Update Indexed (x’7C00 026E’)

lhzux rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       311     0 (reserved)

EA ← (rA) + (rB)
rD ← (16)0 || MEM(EA, 2)
rA ← EA

EA is the sum (rA) + (rB). The half word in memory addressed by EA is loaded into the 
low-order 16 bits of rD. The remaining bits in rD are cleared. 

EA is placed into rA. 

If rA = 0 or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lhzx

Load Half Word and Zero Indexed (x’7C00 022E’)

lhzx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       279     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
rD ← (16)0 || MEM(EA, 2)


EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into the
low-order 16 bits of rD. The remaining bits in rD are cleared. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lmw

Load Multiple Word (x’B800 0000’)

lmw rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  46    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
r ← rD
do while r ≤ 31
    GPR(r) ← MEM(EA, 4)
    r ← r + 1
    EA ← EA + 4

EA is the sum (rA|0) + d.
n = (32 - rD). 

n consecutive words starting at EA are loaded into GPRs rD through r31. 
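
The loop can be modeled as follows. This is a minimal sketch, assuming a list of 32 integers for the GPRs and big-endian memory in a bytes object; the names exts16 and lmw are hypothetical. The invalid-form check reflects one reading of the rule stated below (rA lies in the range rD through r31), and alignment handling is not modeled.

```python
def exts16(d: int) -> int:
    """Sign-extend the 16-bit displacement field d."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def lmw(gpr: list, mem: bytes, rd: int, d: int, ra: int) -> None:
    """Model of lmw rD,d(rA): load n = 32 - rD words into rD through r31."""
    if ra >= rd:
        # rA is among the registers to be loaded (assumed reading of the rule).
        raise ValueError("invalid instruction form")
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + exts16(d)) & 0xFFFFFFFF
    for r in range(rd, 32):                # r = rD .. 31 inclusive
        gpr[r] = int.from_bytes(mem[ea:ea + 4], "big")
        ea += 4
```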

EA must be a multiple of four. If it is not, either the system alignment exception handler is 
invoked or the results are boundedly undefined. For additional information about alignment 
and DSI exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

If rA is in the range of registers specified to be loaded, including the case in which rA = 0, 
the instruction form is invalid. 

NOTE: In some implementations, this instruction is likely to have a greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual load 
or store instructions that produce the same results. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



D 
lswi

Load String Word Immediate (x’7C00 04AA’)

lswi rD,rA,NB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       NB      597     0 (reserved)

if rA = 0
    then EA ← 0
    else EA ← (rA)
if NB = 0
    then n ← 32
    else n ← NB
r ← rD - 1
i ← 0
do while n > 0
    if i = 0
        then r ← r + 1 (mod 32)
             GPR(r) ← (32)0
    GPR(r)[i, i + 7] ← MEM(EA, 1)
    i ← i + 8
    if i = 32 then i ← 0
    EA ← EA + 1
    n ← n - 1

EA is (rA|0). Let n = NB if NB ≠ 0, n = 32 if NB = 0; n is the number of bytes to load.

Let nr = CEIL(n ÷ 4); nr is the number of registers to be loaded with data.
n consecutive bytes starting at EA are loaded into GPRs rD through rD + nr - 1.

Bytes are loaded left to right in each register. The sequence of registers wraps around to r0
if required. If the low-order 4 bytes of register rD + nr - 1 are only partially filled, the
unfilled low-order byte(s) of that register are cleared.
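
The byte-by-byte fill, including register wraparound and clearing of unfilled bytes, can be modeled with a minimal sketch. It assumes a list of 32 integers for the GPRs and a bytes object for memory; the function name lswi is reused for illustration only, and invalid forms and alignment exceptions are not checked.

```python
def lswi(gpr: list, mem: bytes, rd: int, ra: int, nb: int) -> None:
    """Model of lswi rD,rA,NB following the operation description above."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb              # NB = 0 means 32 bytes
    r = rd - 1
    i = 0                                  # bit offset within the current register
    while n > 0:
        if i == 0:
            r = (r + 1) % 32               # wrap from r31 to r0
            gpr[r] = 0                     # unfilled bytes stay cleared
        # Bits i..i+7 (bit 0 is the most-significant bit) receive the byte.
        gpr[r] |= mem[ea] << (24 - i)
        i = (i + 8) % 32
        ea += 1
        n -= 1
```

Loading six bytes with rD = 31 fills r31 completely and places the remaining two bytes in the high-order half of r0.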

If rA is in the range of registers specified to be loaded, including the case in which rA = 0, 
the instruction form is invalid. 

Under certain conditions (for example, segment boundary crossing) the data alignment 
exception handler may be invoked. For additional information about data alignment 
exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

NOTE: In some implementations, this instruction is likely to have greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual load 
or store instructions that produce the same results. 

Other registers altered: None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lswx

Load String Word Indexed (x’7C00 042A’)

lswx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       533     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
n ← XER[25-31]
r ← rD - 1
i ← 0
rD ← undefined
do while n > 0
    if i = 0
        then r ← r + 1 (mod 32)
             GPR(r) ← (32)0
    GPR(r)[i, i + 7] ← MEM(EA, 1)
    i ← i + 8
    if i = 32
        then i ← 0
    EA ← EA + 1
    n ← n - 1

EA is the sum (rA|0) + (rB). Let n = XER[25-31]; n is the number of bytes to load. Let
nr = CEIL(n ÷ 4); nr is the number of registers to receive data. If n > 0, n consecutive bytes
starting at EA are loaded into GPRs rD through rD + nr - 1.

Bytes are loaded left to right in each register. The sequence of registers wraps around
through r0 if required. If the low-order four bytes of rD + nr - 1 are only partially filled,
the unfilled low-order byte(s) of that register are cleared. If n = 0, the contents of rD are 
undefined. 

If rA or rB is in the range of registers specified to be loaded, including the case in which 
rA = 0, either the system illegal instruction error handler is invoked or the results are 
boundedly undefined. 

If rD = rA or rD = rB, the instruction form is invalid. 

If rD and rA both specify GPR0, the form is invalid. 
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Under certain conditions (for example, segment boundary crossing) the data alignment 
exception handler may be invoked. For additional information about data alignment 
exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

NOTE: In some implementations, this instruction is likely to have a greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual load 
or store instructions that produce the same results. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lwarx

Load Word and Reserve Indexed (x’7C00 0028’)

lwarx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       20      0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
RESERVE ← 1
RESERVE_ADDR ← physical_addr(EA)
rD ← MEM(EA, 4)

EA is the sum (rA|0) + (rB).

The word in memory addressed by EA is loaded into rD. 

This instruction creates a reservation for use by a store word conditional indexed 
(stwcx.) instruction. The physical address computed from EA is associated with the
reservation, and replaces any address previously associated with the reservation. 

EA must be a multiple of four. If it is not, either the system alignment exception handler is 
invoked or the results are boundedly undefined. For additional information about alignment 
and DSI exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

When the RESERVE bit is set, the processor enables hardware snooping for the block of 
memory addressed by the RESERVE address. If the processor detects that another 
processor writes to the block of memory it has reserved, it clears the RESERVE bit. The 
stwcx. instruction will only do a store if the RESERVE bit is set. The stwcx. instruction 
sets the CR0[EQ] bit if the store was successful and clears it if it failed. The lwarx and
stwcx. combination can be used for atomic read-modify-write sequences. 

NOTE: The atomic sequence is not guaranteed, but its failure can be detected if 
CR0[EQ] = 0 after the stwcx. instruction. 
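
The reservation mechanism and the retry loop built on it can be illustrated with a small single-threaded model. This is a sketch under stated assumptions, not a description of any implementation: the class Cpu, its method names, and atomic_add are hypothetical, memory is a dictionary of word addresses, the reservation granule is taken to be a single word, and stwcx is assumed to clear the reservation whether or not it stores.

```python
class Cpu:
    """Toy model of one processor's RESERVE bit and RESERVE address."""

    def __init__(self, mem: dict):
        self.mem = mem                 # shared memory: word address -> value
        self.reserve = False
        self.reserve_addr = None

    def lwarx(self, ea: int) -> int:
        self.reserve = True            # RESERVE <- 1
        self.reserve_addr = ea         # replaces any earlier reservation address
        return self.mem[ea]

    def snoop_write(self, ea: int) -> None:
        """Another processor wrote to ea: a matching reservation is lost."""
        if self.reserve and ea == self.reserve_addr:
            self.reserve = False

    def stwcx(self, ea: int, value: int) -> bool:
        """Store only if reserved; the return value models CR0[EQ]."""
        ok = self.reserve
        if ok:
            self.mem[ea] = value
        self.reserve = False           # assumed: reservation cleared either way
        return ok


def atomic_add(cpu: Cpu, ea: int, delta: int) -> int:
    """Read-modify-write that retries until the conditional store succeeds."""
    while True:
        old = cpu.lwarx(ea)
        if cpu.stwcx(ea, (old + delta) & 0xFFFFFFFF):
            return old
```

The loop in atomic_add corresponds to testing CR0[EQ] after stwcx. and branching back to the lwarx when the store did not occur.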

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lwbrx

Load Word Byte-Reverse Indexed (x’7C00 042C’)

lwbrx rD,rA,rB

Bits:   0-5   6-10   11-15   16-20   21-30   31
Field:  31    D      A       B       534     0 (reserved)

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
rD ← MEM(EA + 3, 1) || MEM(EA + 2, 1) || MEM(EA + 1, 1) || MEM(EA, 1)

EA is the sum (rA|0) + (rB). Bits 0-7 of the word in memory addressed by EA are loaded
into the low-order 8 bits of rD. Bits 8-15 of the word in memory addressed by EA are 
loaded into the subsequent low-order 8 bits of rD. Bits 16-23 of the word in memory 
addressed by EA are loaded into the subsequent low-order eight bits of rD. Bits 24-31 of 
the word in memory addressed by EA are loaded into the subsequent low-order 8 bits of rD. 
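
The byte reversal amounts to reading the four bytes at EA in little-endian order. A minimal sketch, assuming memory is a bytes object; the function name lwbrx_value is hypothetical.

```python
def lwbrx_value(mem: bytes, ea: int) -> int:
    """MEM(EA + 3, 1) || MEM(EA + 2, 1) || MEM(EA + 1, 1) || MEM(EA, 1)."""
    # The byte at EA becomes the low-order byte of the result.
    return int.from_bytes(mem[ea:ea + 4], "little")
```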

The PowerPC architecture cautions programmers that some implementations of the 
architecture may run the lwbrx instruction with greater latency than other types of load
instructions. 

Other registers altered: 

• None 


PowerPC Architecture Level 

Supervisor Level 

PowerPC Optional 

Form 

UISA 



X 
lwz

Load Word and Zero (x’8000 0000’)

lwz rD,d(rA)

Bits:   0-5   6-10   11-15   16-31
Field:  32    D      A       d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
rD ← MEM(EA, 4)


EA is the sum (rA|0) + d. The word in memory addressed by EA is loaded into rD.

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | D
lwzu 

Load Word and Zero with Update (x’8400 0000’) 

lwzu rD,d(rA) 

Bits:  0-5 | 6-10 | 11-15 | 16-31
Field: 33  | D    | A     | d

EA ← (rA) + EXTS(d) 
rD ← MEM(EA, 4) 
rA ← EA 


EA is the sum (rA) + d. The word in memory addressed by EA is loaded into rD. 
EA is placed into rA. 

If rA = 0, or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | D
lwzux 

Load Word and Zero with Update Indexed (x’7C00 006E’) 

lwzux rD,rA,rB 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | D    | A     | B     | 55    | 0

Reserved: bit 31. 

EA ← (rA) + (rB) 
rD ← MEM(EA, 4) 
rA ← EA 

EA is the sum (rA) + (rB). The word in memory addressed by EA is loaded into rD. 

EA is placed into rA. 

If rA = 0, or rA = rD, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
lwzx 

Load Word and Zero Indexed (x’7C00 002E’) 

lwzx rD,rA,rB 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | D    | A     | B     | 23    | 0

Reserved: bit 31. 

if rA = 0 
    then b ← 0 
    else b ← (rA) 
EA ← b + (rB) 
rD ← MEM(EA, 4) 

EA is the sum (rA|0) + (rB). The word in memory addressed by EA is loaded into rD. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mcrf 

Move Condition Register Field (x’4C00 0000’) 

mcrf crfD,crfS 

Bits:  0-5 | 6-8  | 9-10 | 11-13 | 14-15 | 16-20     | 21-30      | 31
Field: 19  | crfD | 0 0  | crfS  | 0 0   | 0 0 0 0 0 | 0000000000 | 0

Reserved: bits 9-10, 14-15, 16-20, and 31. 

CR[(4 * crfD)-(4 * crfD + 3)] ← CR[(4 * crfS)-(4 * crfS + 3)] 

The contents of condition register field crfS are copied into condition register field crfD. 
All other condition register fields remain unchanged. 
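
The field copy can be illustrated with a short Python model. It assumes the CR is held as a 32-bit integer with CR field 0 in the four most significant bits, following the bit numbering used in this book; cr_field and mcrf are hypothetical helper names.

```python
def cr_field(cr, n):
    """Return the 4-bit CR field n (field 0 is the most significant nibble)."""
    return (cr >> (4 * (7 - n))) & 0xF


def mcrf(cr, crf_d, crf_s):
    """Copy CR field crf_s into CR field crf_d; other fields are unchanged."""
    shift_d = 4 * (7 - crf_d)
    cleared = cr & ~(0xF << shift_d) & 0xFFFFFFFF
    return cleared | (cr_field(cr, crf_s) << shift_d)
```

With CR = 0x12345678, copying field 7 into field 0 gives 0x82345678 in this model.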

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 

Affected: LT, GT, EQ, SO 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XL
mcrfs 

Move to Condition Register from FPSCR (x’FC00 0080’) 

mcrfs crfD,crfS 

Bits:  0-5 | 6-8  | 9-10 | 11-13 | 14-15 | 16-20     | 21-30 | 31
Field: 63  | crfD | 0 0  | crfS  | 0 0   | 0 0 0 0 0 | 64    | 0

Reserved: bits 9-10, 14-15, 16-20, and 31. 


The contents of FPSCR field crfS are copied to CR field crfD. All exception bits copied 
(except FEX and VX) are cleared in the FPSCR. 

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 

Affected: FX, FEX, VX, OX 

• Floating-Point Status and Control Register: 

Affected: FX, OX (if crfS = 0) 

Affected: UX, ZX, XX, VXSNAN (if crfS = 1) 

Affected: VXISI, VXIDI, VXZDZ, VXIMZ (if crfS = 2) 

Affected: VXVC (if crfS = 3) 

Affected: VXSOFT, VXSQRT, VXCVI (if crfS = 5) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mcrxr 

Move to Condition Register from XER (x’7C00 0400’) 

mcrxr crfD 

Bits:  0-5 | 6-8  | 9-10 | 11-15     | 16-20     | 21-30 | 31
Field: 31  | crfD | 0 0  | 0 0 0 0 0 | 0 0 0 0 0 | 512   | 0

Reserved: bits 9-10, 11-15, 16-20, and 31. 

CR[(4 * crfD)-(4 * crfD + 3)] ← XER[0-3] 
XER[0-3] ← 0b0000 


The contents of XER[0-3] are copied into the condition register field designated by crfD. 
All other fields of the condition register remain unchanged. XER[0-3] is cleared. 

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 

Affected: LT, GT, EQ, SO 

• XER [0-3] 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mfcr 

Move from Condition Register (x’7C00 0026’) 

mfcr rD 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 31  | D    | 0 0 0 0 0 | 0 0 0 0 0 | 19    | 0

Reserved: bits 11-15, 16-20, and 31. 

rD ← CR 

The contents of the condition register (CR) are placed into rD. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mffsx 

Move from FPSCR (x’FC00 048E’) 

mffs frD (Rc = 0) 
mffs. frD (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 63  | D    | 0 0 0 0 0 | 0 0 0 0 0 | 583   | Rc

Reserved: bits 11-15 and 16-20. 

frD[32-63] ← FPSCR 

The contents of the floating-point status and control register (FPSCR) are placed into the 
low-order bits of register frD. The high-order bits of register frD are undefined. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mfmsr 

Move from Machine State Register (x’7C00 00A6’) 

mfmsr rD 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 31  | D    | 0 0 0 0 0 | 0 0 0 0 0 | 83    | 0

Reserved: bits 11-15, 16-20, and 31. 

rD ← MSR 

The contents of the MSR are placed into rD. 
This is a supervisor-level instruction. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mfspr 

Move from Special-Purpose Register (x’7C00 02A6’) 

mfspr rD,SPR 

Bits:  0-5 | 6-10 | 11-20 | 21-30 | 31
Field: 31  | D    | spr*  | 339   | 0

Reserved: bit 31. 

NOTE: *This is a split field. 

n ← spr[5-9] || spr[0-4] 
rD ← SPR(n) 

In the PowerPC UISA, the SPR field denotes a special-purpose register, encoded as shown 
in Table 8-9. The contents of the designated special-purpose register are placed into rD. 


Table 8-9. PowerPC UISA SPR Encodings for mfspr 

SPR** Decimal | spr[5-9] | spr[0-4] | Register Name
1             | 00000    | 00001    | XER
8             | 00000    | 01000    | LR
9             | 00000    | 01001    | CTR

** Note: The order of the two 5-bit halves of the SPR number 
is reversed compared with the actual instruction coding. 


If the SPR field contains any value other than one of the values shown in Table 8-9 (and the 
processor is in user mode), one of the following occurs: 

• The system illegal instruction error handler is invoked. 

• The system supervisor-level instruction error handler is invoked. 

• The results are boundedly undefined. 

Other registers altered: 

• None 


Simplified mnemonics: 

mfxer rD   equivalent to   mfspr rD,1 
mflr rD    equivalent to   mfspr rD,8 
mfctr rD   equivalent to   mfspr rD,9 

In the PowerPC OEA, the SPR field denotes a special-purpose register, encoded as shown 
in Table 8-10. The contents of the designated SPR are placed into rD. 

In the PowerPC UISA, the SPR field denotes a special-purpose register, encoded as shown 
in Table 8-10. If the SPR[0] = 0 (Access type User), the contents of the designated SPR are 
placed into rD. 

NOTE: For this instruction (mfspr), SPR[0] = 1 if and only if reading the register is 
supervisor-level. Execution of this instruction specifying a defined and supervisor- 
level register when MSR[PR] = 1 results in a privileged instruction type program 
exception. 

If MSR[PR] = 1, the only effect of executing an instruction with an SPR number that is not 
shown in Table 8-10 and has SPR[0] = 1 is to cause a privileged instruction type 
program exception or an illegal instruction type program exception. In all other cases 
(MSR[PR] = 0 or SPR[0] = 0), if the SPR field contains any value that is not shown in 
Table 8-10, either an illegal instruction type program exception occurs or the results are 
boundedly undefined. 

Other registers altered: 

• None 

Table 8-10. PowerPC OEA SPR Encodings for mfspr 

SPR¹ Decimal | spr[5-9] | spr[0-4] | Register Name | Access
1            | 00000    | 00001    | XER           | User
8            | 00000    | 01000    | LR            | User
9            | 00000    | 01001    | CTR           | User
18           | 00000    | 10010    | DSISR         | Supervisor
19           | 00000    | 10011    | DAR           | Supervisor
22           | 00000    | 10110    | DEC           | Supervisor
25           | 00000    | 11001    | SDR1          | Supervisor
26           | 00000    | 11010    | SRR0          | Supervisor
27           | 00000    | 11011    | SRR1          | Supervisor
272          | 01000    | 10000    | SPRG0         | Supervisor
273          | 01000    | 10001    | SPRG1         | Supervisor
274          | 01000    | 10010    | SPRG2         | Supervisor
275          | 01000    | 10011    | SPRG3         | Supervisor
282          | 01000    | 11010    | EAR           | Supervisor
287          | 01000    | 11111    | PVR           | Supervisor
528          | 10000    | 10000    | IBAT0U        | Supervisor
529          | 10000    | 10001    | IBAT0L        | Supervisor
530          | 10000    | 10010    | IBAT1U        | Supervisor
531          | 10000    | 10011    | IBAT1L        | Supervisor
532          | 10000    | 10100    | IBAT2U        | Supervisor
533          | 10000    | 10101    | IBAT2L        | Supervisor
534          | 10000    | 10110    | IBAT3U        | Supervisor
535          | 10000    | 10111    | IBAT3L        | Supervisor
536          | 10000    | 11000    | DBAT0U        | Supervisor
537          | 10000    | 11001    | DBAT0L        | Supervisor
538          | 10000    | 11010    | DBAT1U        | Supervisor
539          | 10000    | 11011    | DBAT1L        | Supervisor
540          | 10000    | 11100    | DBAT2U        | Supervisor
541          | 10000    | 11101    | DBAT2L        | Supervisor
542          | 10000    | 11110    | DBAT3U        | Supervisor
543          | 10000    | 11111    | DBAT3L        | Supervisor
1013         | 11111    | 10101    | DABR          | Supervisor

¹ Note: The order of the two 5-bit halves of the SPR number is reversed 
compared with actual instruction coding. 

For mtspr and mfspr instructions, the SPR number coded in assembly 
language does not appear directly as a 10-bit binary number in the 
instruction. The number coded is split into two 5-bit halves that are 
reversed in the instruction, with the high-order five bits appearing in bits 
16-20 of the instruction and the low-order five bits in bits 11-15. 


NOTE: mfspr is supervisor-level only if SPR[0] = 1. 
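
The split-field encoding described in the table note can be sketched in Python. This is an illustrative model only: the function names encode_spr_field and encode_mfspr are hypothetical, and the instruction word is assembled from the field layout given for mfspr (primary opcode 31, extended opcode 339, reserved bit 31 set to 0).

```python
def encode_spr_field(spr):
    """Return the 10-bit value placed in instruction bits 11-20."""
    high = (spr >> 5) & 0x1F  # high-order five bits of the SPR number
    low = spr & 0x1F          # low-order five bits of the SPR number
    # Bits 11-15 receive the low-order half, bits 16-20 the high-order half.
    return (low << 5) | high


def encode_mfspr(rd, spr):
    """Assemble the 32-bit instruction word for mfspr rD,SPR."""
    return (31 << 26) | (rd << 21) | (encode_spr_field(spr) << 11) | (339 << 1)
```

For example, mfspr r0,8 (the simplified mnemonic mflr r0) assembles to 0x7C0802A6 in this model.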


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA/OEA                   | yes*             |                  | XFX
mfsr 

Move from Segment Register (x’7C00 04A6’) 

mfsr rD,SR 

Bits:  0-5 | 6-10 | 11 | 12-15 | 16-20     | 21-30 | 31
Field: 31  | D    | 0  | SR    | 0 0 0 0 0 | 595   | 0

Reserved: bits 11, 16-20, and 31. 

rD ← SEGREG(SR) 

The contents of the segment register SR are copied into rD. 
This is a supervisor-level instruction. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mfsrin 

Move from Segment Register Indirect (x’7C00 0526’) 

mfsrin rD,rB 

Bits:  0-5 | 6-10 | 11-15     | 16-20 | 21-30 | 31
Field: 31  | D    | 0 0 0 0 0 | B     | 659   | 0

Reserved: bits 11-15 and 31. 

rD ← SEGREG(rB[0-3]) 

The contents of the segment register selected by bits 0-3 of rB are copied into rD. 

This is a supervisor-level instruction. 

The rA field is not defined for the mfsrin instruction in the PowerPC architecture. 
However, mfsrin performs the same function in the PowerPC architecture as does the mfsri 
instruction in the POWER architecture (if rA = 0). 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mftb 

Move from Time Base (x’7C00 02E6’) 

mftb rD,TBR 

Bits:  0-5 | 6-10 | 11-20 | 21-30 | 31
Field: 31  | D    | tbr*  | 371   | 0

Reserved: bit 31. 

NOTE: *This is a split field. 

n ← tbr[5-9] || tbr[0-4] 
if n = 268 
    then rD ← TBL 
else if n = 269 
    then rD ← TBU 
else error (invalid TBR field) 


The contents of the designated register are copied into rD. The TBR field denotes either the 
TBL or TBU, encoded as shown in Table 8-11. 


Table 8-11. TBR Encodings for mftb 

TBR* Decimal | tbr[5-9] | tbr[0-4] | Register Name | Access
268          | 01000    | 01100    | TBL           | User
269          | 01000    | 01101    | TBU           | User

* Note: The order of the two 5-bit halves of the TBR number is 
reversed. 


If the TBR field contains any value other than one of the values shown in Table 8-11, then 
one of the following occurs: 

• The system illegal instruction error handler is invoked. 

• The system supervisor-level instruction error handler is invoked. 

• The results are boundedly undefined. 

Some implementations may implement mftb and mfspr identically; therefore, a TBR 
number must not match an SPR number. 

For more information on the time base refer to Section 2.2, “PowerPC VEA Register 
Set — Time Base.” 

Other registers altered: 

• None 
Simplified mnemonics: 

mftb rD    equivalent to   mftb rD,268 
mftbu rD   equivalent to   mftb rD,269 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
VEA                        |                  |                  | XFX
mtcrf 

Move to Condition Register Fields (x’7C00 0120’) 

mtcrf CRM,rS 

Bits:  0-5 | 6-10 | 11 | 12-19 | 20 | 21-30 | 31
Field: 31  | S    | 0  | CRM   | 0  | 144   | 0

Reserved: bits 11, 20, and 31. 

mask ← (4)(CRM[0]) || (4)(CRM[1]) || ... (4)(CRM[7]) 
CR ← (rS & mask) | (CR & ¬ mask) 


The contents of rS are placed into the condition register under control of the field mask 
specified by CRM. The field mask identifies the 4-bit fields affected. Let i be an integer in 
the range 0-7. If CRM[i] = 1, CR field i (CR bits 4 * i through 4 * i + 3) is set to the contents 
of the corresponding field of rS. 

NOTE: Updating a subset of the eight fields of the condition register may have 
substantially poorer performance on some implementations than updating all of 
the fields. 
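
The mask expansion can be illustrated with a small Python model. It assumes the CR and rS are 32-bit integers and that CRM is an 8-bit integer whose most significant bit is CRM[0]; the function name mtcrf is used here only as a hypothetical helper.

```python
def mtcrf(cr, rs, crm):
    """Return the new CR value after mtcrf CRM,rS."""
    mask = 0
    for i in range(8):
        if (crm >> (7 - i)) & 1:           # CRM[i], with CRM[0] most significant
            mask |= 0xF << (4 * (7 - i))   # CR field i covers bits 4*i .. 4*i+3
    return (rs & mask) | (cr & ~mask & 0xFFFFFFFF)
```

A CRM value of 0xFF replaces the whole CR (the mtcr simplified mnemonic), while 0x80 replaces only CR field 0.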

Other registers altered: 

• CR fields selected by mask 

Simplified mnemonics: 

mtcr rS equivalent to mtcrf 0xFF,rS 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XFX
mtfsb0x 

Move to FPSCR Bit 0 (x’FC00 008C’) 

mtfsb0 crbD (Rc = 0) 
mtfsb0. crbD (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 63  | crbD | 0 0 0 0 0 | 0 0 0 0 0 | 70    | Rc

Reserved: bits 11-15 and 16-20. 

FPSCR[crbD] ← 0 


Bit crbD of the FPSCR is cleared. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPSCR bit crbD 

NOTE: Bits 1 and 2 (FEX and VX) cannot be explicitly cleared. 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mtfsb1x 

Move to FPSCR Bit 1 (x’FC00 004C’) 

mtfsb1 crbD (Rc = 0) 
mtfsb1. crbD (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 63  | crbD | 0 0 0 0 0 | 0 0 0 0 0 | 38    | Rc

Reserved: bits 11-15 and 16-20. 

FPSCR[crbD] ← 1 

Bit crbD of the FPSCR is set. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPSCR bit crbD and FX 

NOTE: Bits 1 and 2 (FEX and VX) cannot be explicitly set. 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mtfsfx 

Move to FPSCR Fields (x’FC00 058E’) 

mtfsf FM,frB (Rc = 0) 
mtfsf. FM,frB (Rc = 1) 

Bits:  0-5 | 6 | 7-14 | 15 | 16-20 | 21-30 | 31
Field: 63  | 0 | FM   | 0  | B     | 711   | Rc

Reserved: bits 6 and 15. 


The low-order 32 bits of frB are placed into the FPSCR under control of the field mask 
specified by FM. The field mask identifies the 4-bit fields affected. Let i be an integer in the 
range 0-7. If FM[i] = 1, FPSCR field i (FPSCR bits 4 * i through 4 * i + 3) is set to the 
contents of the corresponding field of the low-order 32 bits of register frB. 

FPSCR[FX] is altered only if FM[0] = 1. 

Updating fewer than all eight fields of the FPSCR may have substantially poorer 
performance on some implementations than updating all the fields. 

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the values of frB[32] and 
frB[35] (that is, even if this instruction causes OX to change from 0 to 1, FX is set from 
frB[32] and not by the usual rule that FX is set when an exception bit changes from 0 to 1). 
Bits 1 and 2 (FEX and VX) are set according to the usual rule and not from frB[33-34]. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPSCR fields selected by mask 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XFL
mtfsfix 

Move to FPSCR Field Immediate (x’FC00 010C’) 

mtfsfi crfD,IMM (Rc = 0) 
mtfsfi. crfD,IMM (Rc = 1) 

Bits:  0-5 | 6-8  | 9-10 | 11-15     | 16-19 | 20 | 21-30 | 31
Field: 63  | crfD | 0 0  | 0 0 0 0 0 | IMM   | 0  | 134   | Rc

Reserved: bits 9-10, 11-15, and 20. 

FPSCR[crfD] ← IMM 

The value of the IMM field is placed into FPSCR field crfD. 

FPSCR[FX] is altered only if crfD = 0. 

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the values of IMM[0] and 
IMM[3] (that is, even if this instruction causes OX to change from 0 to 1, FX is set from 
IMM[0] and not by the usual rule that FX is set when an exception bit changes from 0 to 
1). Bits 1 and 2 (FEX and VX) are set according to the usual rule and not from IMM[1-2]. 

Other registers altered: 

• Condition Register (CR1 field): 

Affected: FX, FEX, VX, OX (If Rc = 1) 

• Floating-Point Status and Control Register: 

Affected: FPSCR field crfD 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
mtmsr 

Move to Machine State Register (x’7C00 0124’) 

mtmsr rS 

Bits:  0-5 | 6-10 | 11-15     | 16-20     | 21-30 | 31
Field: 31  | S    | 0 0 0 0 0 | 0 0 0 0 0 | 146   | 0

Reserved: bits 11-15, 16-20, and 31. 

MSR ← (rS) 

The contents of rS are placed into the MSR. 

This is a supervisor-level instruction. It is also an execution synchronizing instruction 
except with respect to alterations to the POW and LE bits. Refer to Section 2.3.18, 
“Synchronization Requirements for Special Registers and for Lookaside Buffers,” for more 
information. 

In addition, alterations to the MSR[EE] and MSR[RI] bits are effective as soon as the 
instruction completes. Thus if MSR[EE] = 0 and an external or decrementer exception is 
pending, executing an mtmsr instruction that sets MSR[EE] = 1 will cause the external or 
decrementer exception to be taken before the next instruction is executed, if no higher 
priority exception exists. 

Other registers altered: 

• MSR 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mtspr 

Move to Special-Purpose Register (x’7C00 03A6’) 

mtspr SPR,rS 

Bits:  0-5 | 6-10 | 11-20 | 21-30 | 31
Field: 31  | S    | spr*  | 467   | 0

Reserved: bit 31. 

NOTE: *This is a split field. 

n ← spr[5-9] || spr[0-4] 
SPR(n) ← (rS) 


In the PowerPC UISA, the SPR field denotes a special-purpose register, encoded as shown 
in Table 8-12. The contents of rS are placed into the designated special-purpose register. 

Table 8-12. PowerPC UISA SPR Encodings for mtspr 

SPR** Decimal | spr[5-9] | spr[0-4] | Register Name
1             | 00000    | 00001    | XER
8             | 00000    | 01000    | LR
9             | 00000    | 01001    | CTR

** Note: The order of the two 5-bit halves of the SPR number 
is reversed compared with actual instruction coding. 


If the SPR field contains any value other than one of the values shown in Table 8-12, and 
the processor is operating in user mode, one of the following occurs: 

• The system illegal instruction error handler is invoked. 

• The system supervisor instruction error handler is invoked. 

• The results are boundedly undefined. 

Other registers altered: 

• See Table 8-12. 


Simplified mnemonics: 

mtxer rD   equivalent to   mtspr 1,rD 
mtlr rD    equivalent to   mtspr 8,rD 
mtctr rD   equivalent to   mtspr 9,rD 

In the PowerPC OEA, the SPR field denotes a special-purpose register, encoded as shown 
in Table 8-13. The contents of rS are placed into the designated special-purpose register. 

In the PowerPC UISA, if the SPR[0]=0 (Access is User) the contents of rS are placed into 
the designated special-purpose register. 

For this instruction, SPRs TBL and TBU are treated as separate 32-bit registers; setting one 
leaves the other unaltered. 

The value of SPR[0] = 1 if and only if writing the register is a supervisor-level operation. 
Execution of this instruction specifying a defined and supervisor-level register when 
MSR[PR] = 1 results in a privileged instruction type program exception. 

If MSR[PR] = 1, the only effect of executing an instruction with an SPR number that 
is not shown in Table 8-13 and has SPR[0] = 1 is to cause a privileged instruction type 
program exception or an illegal instruction type program exception. In all other cases 
(MSR[PR] = 0 or SPR[0] = 0), if the SPR field contains any value that is not shown in 
Table 8-13, either an illegal instruction type program exception occurs or the results are 
boundedly undefined. 

Other registers altered: 

• See Table 8-13. 

Table 8-13. PowerPC OEA SPR Encodings for mtspr 

SPR¹ Decimal | spr[5-9] | spr[0-4] | Register Name | Access
1            | 00000    | 00001    | XER           | User
8            | 00000    | 01000    | LR            | User
9            | 00000    | 01001    | CTR           | User
18           | 00000    | 10010    | DSISR         | Supervisor
19           | 00000    | 10011    | DAR           | Supervisor
22           | 00000    | 10110    | DEC           | Supervisor
25           | 00000    | 11001    | SDR1          | Supervisor
26           | 00000    | 11010    | SRR0          | Supervisor
27           | 00000    | 11011    | SRR1          | Supervisor
272          | 01000    | 10000    | SPRG0         | Supervisor
273          | 01000    | 10001    | SPRG1         | Supervisor
274          | 01000    | 10010    | SPRG2         | Supervisor
275          | 01000    | 10011    | SPRG3         | Supervisor
282          | 01000    | 11010    | EAR           | Supervisor
284          | 01000    | 11100    | TBL           | Supervisor
285          | 01000    | 11101    | TBU           | Supervisor
528          | 10000    | 10000    | IBAT0U        | Supervisor
529          | 10000    | 10001    | IBAT0L        | Supervisor
530          | 10000    | 10010    | IBAT1U        | Supervisor
531          | 10000    | 10011    | IBAT1L        | Supervisor
532          | 10000    | 10100    | IBAT2U        | Supervisor
533          | 10000    | 10101    | IBAT2L        | Supervisor
534          | 10000    | 10110    | IBAT3U        | Supervisor
535          | 10000    | 10111    | IBAT3L        | Supervisor
536          | 10000    | 11000    | DBAT0U        | Supervisor
537          | 10000    | 11001    | DBAT0L        | Supervisor
538          | 10000    | 11010    | DBAT1U        | Supervisor
539          | 10000    | 11011    | DBAT1L        | Supervisor
540          | 10000    | 11100    | DBAT2U        | Supervisor
541          | 10000    | 11101    | DBAT2L        | Supervisor
542          | 10000    | 11110    | DBAT3U        | Supervisor
543          | 10000    | 11111    | DBAT3L        | Supervisor
1013         | 11111    | 10101    | DABR          | Supervisor

¹ Note: The order of the two 5-bit halves of the SPR number is reversed. For mtspr and 
mfspr instructions, the SPR number coded in assembly language does not appear 
directly as a 10-bit binary number in the instruction. The number coded is split into two 
5-bit halves that are reversed in the instruction, with the high-order five bits appearing 
in bits 16-20 of the instruction and the low-order five bits in bits 11-15. 


NOTE: mtspr is supervisor-level only if SPR[0] = 1. 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA/OEA                   | yes*             |                  | XFX
mtsr 

Move to Segment Register (x’7C00 01A4’) 

mtsr SR,rS 

Bits:  0-5 | 6-10 | 11 | 12-15 | 16-20     | 21-30 | 31
Field: 31  | S    | 0  | SR    | 0 0 0 0 0 | 210   | 0

Reserved: bits 11, 16-20, and 31. 

SEGREG(SR) ← (rS) 

The contents of rS are placed into SR. 
This is a supervisor-level instruction. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mtsrin 

Move to Segment Register Indirect (x’7C00 01E4’) 

mtsrin rS,rB 

Bits:  0-5 | 6-10 | 11-15     | 16-20 | 21-30 | 31
Field: 31  | S    | 0 0 0 0 0 | B     | 242   | 0

Reserved: bits 11-15 and 31. 

SEGREG(rB[0-3]) ← (rS) 

The contents of rS are copied to the segment register selected by bits 0-3 of rB. 

This is a supervisor-level instruction. 

NOTE: The PowerPC architecture does not define the rA field for the mtsrin instruction. 
However, mtsrin performs the same function in the PowerPC architecture as 
does the mtsri instruction in the POWER architecture (if rA = 0). 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
OEA                        | yes              |                  | X
mulhwx 

Multiply High Word (x’7C00 0096’) 

mulhw rD,rA,rB (Rc = 0) 
mulhw. rD,rA,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21 | 22-30 | 31
Field: 31  | D    | A     | B     | 0  | 75    | Rc

Reserved: bit 21. 

prod[0-63] ← (rA) * (rB) 
rD ← prod[0-31] 


The 64-bit product is formed from the contents of rA and rB. The high-order 32 bits of the 
64-bit product of the operands are placed into rD. 

Both the operands and the product are interpreted as signed integers. 

This instruction may execute faster on some implementations if rB contains the operand 
having the smaller absolute value. 

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XO
mulhwux 

Multiply High Word Unsigned (x’7C00 0016’) 

mulhwu rD,rA,rB (Rc = 0) 
mulhwu. rD,rA,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21 | 22-30 | 31
Field: 31  | D    | A     | B     | 0  | 11    | Rc

Reserved: bit 21. 

prod[0-63] ← (rA) * (rB) 
rD ← prod[0-31] 

The 32-bit operands are the contents of rA and rB. The high-order 32 bits of the 64-bit 
product of the operands are placed into rD. 

Both the operands and the product are interpreted as unsigned integers, except that if 
Rc = 1 the first three bits of CR0 field are set by signed comparison of the result to zero. 

This instruction may execute faster on some implementations if rB contains the operand 
having the smaller absolute value. 
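
The difference between the signed (mulhw) and unsigned (mulhwu) interpretations can be sketched in Python. The helper names to_signed32, mulhw, and mulhwu are hypothetical, registers are modeled as 32-bit unsigned integers, and CR0 updates for Rc = 1 are not modeled.

```python
def to_signed32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product."""
    prod = to_signed32(ra) * to_signed32(rb)
    return (prod >> 32) & 0xFFFFFFFF


def mulhwu(ra, rb):
    """High-order 32 bits of the unsigned 64-bit product."""
    prod = (ra & 0xFFFFFFFF) * (rb & 0xFFFFFFFF)
    return prod >> 32
```

With rA = 0xFFFFFFFF and rB = 2, the signed product is -2, so the signed high word is 0xFFFFFFFF, whereas the unsigned product is 0x1_FFFF_FFFE and the unsigned high word is 1.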

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XO
mulli 

Multiply Low Immediate (x’1C00 0000’) 

mulli rD,rA,SIMM 

Bits:  0-5 | 6-10 | 11-15 | 16-31
Field: 07  | D    | A     | SIMM

prod[0-63] ← (rA) * EXTS(SIMM) 
rD ← prod[32-63] 


The first operand is (rA). The second operand is the sign-extended value of the SIMM field. 
The low-order 32-bits of the 64-bit product of the operands are placed into rD. 

Both the operands and the product are interpreted as signed integers. The low-order 32-bits 
of the product are calculated independently of whether the operands are treated as signed 
or unsigned 32-bit integers. 

This instruction can be used with mulhdx or mulhwx to calculate a full 64-bit product. 

Other registers altered: 

• None 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | D
mullwx 

Multiply Low Word (x’7C00 01D6’) 

mullw rD,rA,rB (OE = 0 Rc = 0) 
mullw. rD,rA,rB (OE = 0 Rc = 1) 
mullwo rD,rA,rB (OE = 1 Rc = 0) 
mullwo. rD,rA,rB (OE = 1 Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21 | 22-30 | 31
Field: 31  | D    | A     | B     | OE | 235   | Rc

prod[0-63] ← (rA) * (rB) 
rD ← prod[32-63] 


The 32-bit operands are the contents of rA and rB. The low-order 32-bits of the 64-bit 
product (rA) * (rB) are placed into rD. 

The low-order 32-bits of the product are independent of whether the operands are regarded 
as signed or unsigned 32-bit integers. 

If OE = 1, then OV is set if the product cannot be represented in 32 bits. Both the operands 
and the product are interpreted as signed integers. 

This instruction can be used with mulhwx to calculate a full 64-bit product. 

NOTE: This instruction may execute faster on some implementations if rB contains the 
operand having the smaller absolute value. 
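
The following Python sketch illustrates the low-word result, the overflow condition reported when OE = 1, and how the low and high words combine into a full 64-bit signed product. All function names are hypothetical, registers are modeled as 32-bit unsigned integers, and the XER and CR0 updates themselves are not modeled.

```python
def to_signed32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mullw(ra, rb):
    """Return (rD, overflow) where overflow models the OV condition."""
    prod = to_signed32(ra) * to_signed32(rb)
    rd = prod & 0xFFFFFFFF
    overflow = not (-0x80000000 <= prod <= 0x7FFFFFFF)
    return rd, overflow


def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product."""
    return ((to_signed32(ra) * to_signed32(rb)) >> 32) & 0xFFFFFFFF


def full_product(ra, rb):
    """Combine the high and low words into the 64-bit signed product."""
    low, _ = mullw(ra, rb)
    return (mulhw(ra, rb) << 32) | low
```

The low word is the same whether the operands are treated as signed or unsigned, which is why only the high-word instruction has separate signed and unsigned forms.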

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 

NOTE: CR0 field may not reflect the infinitely precise result if overflow occurs (see 
next). 

• XER: 

Affected: SO, OV (If OE = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XO
nandx 

NAND (x’7C00 03B8’) 

nand rA,rS,rB (Rc = 0) 
nand. rA,rS,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | S    | A     | B     | 476   | Rc

rA ← ¬ ((rS) & (rB)) 

The contents of rS are ANDed with the contents of rB and the complemented result is 
placed into rA. 

nand with rS = rB can be used to obtain the one's complement. 

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | X
negx 

Negate (x’7C00 00D0’) 

neg rD,rA (OE = 0 Rc = 0) 
neg. rD,rA (OE = 0 Rc = 1) 
nego rD,rA (OE = 1 Rc = 0) 
nego. rD,rA (OE = 1 Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20     | 21 | 22-30 | 31
Field: 31  | D    | A     | 0 0 0 0 0 | OE | 104   | Rc

Reserved: bits 16-20. 

rD ← ¬ (rA) + 1 

The value 1 is added to the one’s complement of the value in rA, and the resulting two’s 
complement is placed into rD. 

If rA contains the most negative 32-bit number (0x8000_0000), the result is the most 
negative number and, if OE = 1, OV is set. 
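
The special case of the most negative number can be seen in a short Python model. The function name neg is a hypothetical helper; the second return value models the condition under which OV would be set when OE = 1.

```python
def neg(ra):
    """Return (rD, overflow) for the two's complement negation of rA."""
    ra &= 0xFFFFFFFF
    rd = (~ra + 1) & 0xFFFFFFFF          # one's complement plus 1
    overflow = ra == 0x80000000          # result cannot be represented
    return rd, overflow
```

Negating 0x8000_0000 yields 0x8000_0000 again, because +2147483648 has no 32-bit two's complement representation.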

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 

• XER: 

Affected: SO, OV (If OE = 1) 


PowerPC Architecture Level | Supervisor Level | PowerPC Optional | Form
UISA                       |                  |                  | XO
norx 

NOR (x’7C00 00F8’) 

nor rA,rS,rB (Rc = 0) 
nor. rA,rS,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | S    | A     | B     | 124   | Rc

rA ← ¬ ((rS) | (rB)) 

The contents of rS are ORed with the contents of rB and the complemented result is placed 
into rA. 

nor with rS = rB can be used to obtain the one’s complement. 
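
A minimal Python model shows why using the same register for both source operands yields the one's complement. The helper names nor and nand are hypothetical and model only the 32-bit result, not the CR0 update.

```python
MASK32 = 0xFFFFFFFF


def nor(rs, rb):
    """Complement of the OR of two 32-bit values."""
    return ~(rs | rb) & MASK32


def nand(rs, rb):
    """Complement of the AND of two 32-bit values."""
    return ~(rs & rb) & MASK32
```

Because x | x = x and x & x = x, both nor(x, x) and nand(x, x) reduce to the complement of x.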

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 

Simplified mnemonics: 

not rD,rS equivalent to nor rA,rS,rS 


PowerPC Architecture Level 
PowerPC Optional 

Form 
orx 

OR (x’7C00 0378’) 

or rA,rS,rB (Rc = 0) 
or. rA,rS,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | S    | A     | B     | 444   | Rc

rA ← (rS) | (rB) 

The contents of rS are ORed with the contents of rB and the result is placed into rA. 

The simplified mnemonic mr (shown below) demonstrates the use of the or instruction to 
move register contents. 

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 

Simplified mnemonics: 

mr rA,rS equivalent to or rA,rS,rS 
orcx 

OR with Complement (x’7C00 0338’) 

orc rA,rS,rB (Rc = 0) 
orc. rA,rS,rB (Rc = 1) 

Bits:  0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31
Field: 31  | S    | A     | B     | 412   | Rc

rA ← (rS) | ¬ (rB) 

The contents of rS are ORed with the complement of the contents of rB and the result is 
placed into rA. 

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (If Rc = 1) 
ori 

OR Immediate (x’6000 0000’) 

ori rA,rS,UIMM 

Bits:  0-5 | 6-10 | 11-15 | 16-31
Field: 24  | S    | A     | UIMM

rA ← (rS) | ((16)0 || UIMM) 

The contents of rS are ORed with 0x0000 || UIMM and the result is placed into rA. 
The preferred no-op (an instruction that does nothing) is ori 0,0,0. 

Other registers altered: 

• None 

Simplified mnemonics: 

nop equivalent to ori 0,0,0

oris

OR Immediate Shifted (x’6400 0000’)

oris    rA,rS,UIMM

Bits     0-5    6-10    11-15    16-31
Field    25     S       A        UIMM

rA ← (rS) | (UIMM || (16)0)

The contents of rS are ORed with UIMM || 0x0000 and the result is placed into rA.
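
The two immediate OR forms differ only in which half of the word receives UIMM. The following Python sketch models that on 32-bit values; the function names and the idea of combining the two calls to fill both halves of a zeroed register are illustrative assumptions, not text from the architecture definition.

```python
MASK32 = 0xFFFFFFFF

def ori(rs, uimm):
    """Model of rA <- (rS) | ((16)0 || UIMM)."""
    return (rs | (uimm & 0xFFFF)) & MASK32

def oris(rs, uimm):
    """Model of rA <- (rS) | (UIMM || (16)0)."""
    return (rs | ((uimm & 0xFFFF) << 16)) & MASK32

# Fill the upper half, then the lower half, of a value that starts at zero.
value = ori(oris(0, 0x1234), 0x5678)
```

ORing with a zero immediate leaves the source value unchanged, which is why ori 0,0,0 can serve as a no-op.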

Other registers altered: 

• None

rfi

Return from Interrupt (x’4C00 0064’)

rfi

Bits     0-5    6-10     11-15    16-20    21-30    31
Field    19     00000    00000    00000    50       0

Reserved: bits 6-20 and 31

MSR[0, 5-9, 16-23, 25-27, 30-31] ← SRR1[0, 5-9, 16-23, 25-27, 30-31]
MSR[13] ← 0
NIA ←iea SRR0[0-29] || 0b00

Bits SRR1[0, 5-9, 16-23, 25-27, 30-31] are placed into the corresponding bits of the MSR.
MSR[13] is set to 0. If the new MSR value does not enable any pending exceptions, then
the next instruction is fetched, under control of the new MSR value, from the address
SRR0[0-29] || 0b00. If the new MSR value enables one or more pending exceptions, the
exception associated with the highest priority pending exception is generated; in this case
the value placed into SRR0 by the exception processing mechanism is the address of the
instruction that would have been executed next had the exception not occurred.

NOTE: An implementation may define additional MSR bits, and in this case, may also 
cause them to be saved to SRR1 from MSR on an exception and restored to MSR 
from SRR1 on an rfi. 

This is a supervisor-level, context synchronizing instruction. 

Other registers altered: 

• MSR

rlwimix

Rotate Left Word Immediate then Mask Insert (x’5000 0000’)

rlwimi     rA,rS,SH,MB,ME    (Rc = 0)
rlwimi.    rA,rS,SH,MB,ME    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-25    26-30    31
Field    20     S       A        SH       MB       ME       Rc

n ← SH
r ← ROTL(rS, n)
m ← MASK(MB, ME)
rA ← (r & m) | (rA & ¬ m)


The contents of rS are rotated left the number of bits specified by operand SH. A mask is 
generated having 1 bits from bit MB through bit ME and 0 bits elsewhere. The rotated data 
is inserted into rA under control of the generated mask. 

NOTE: rlwimi can be used to copy a bit field of any length from register rS into the
contents of rA. This field can start from any bit position in rS and be placed into
any position in rA. The length of the field can range from 0 to 32 bits. The
remaining bits in register rA remain unchanged:

• To copy byte_0 (bits 0-7) from rS into byte_3 (bits 24-31) of rA, set SH = 8, MB = 24,
and ME = 31.

• In general, to copy an n-bit field that starts in bit position b in register rS into register
rA starting at bit position c, set SH = (32 - c + b) mod 32, MB = c, and
ME = ((c + n) - 1) mod 32.
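
The rotate, mask, and insert steps can be checked with a small Python model. This is a sketch only: rotl32, mask, rlwimi, and copy_field are illustrative helper names, and bit 0 is treated as the most-significant bit, as in the operation description above.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1 bits from bit mb through bit me (bit 0 = MSB), wrapping if mb > me."""
    from_mb = MASK32 >> mb                    # ones in bits mb..31
    up_to_me = (MASK32 << (31 - me)) & MASK32  # ones in bits 0..me
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)

def rlwimi(ra, rs, sh, mb, me):
    """rA <- (ROTL(rS, SH) & m) | (rA & ~m)."""
    m = mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)

def copy_field(ra, rs, n, b, c):
    """Copy the n-bit field at bit b of rS to bit c of rA (general rule)."""
    return rlwimi(ra, rs, (32 - c + b) % 32, c, (c + n - 1) % 32)

# byte_0 of rS into byte_3 of rA: SH = 8, MB = 24, ME = 31
byte_copy = rlwimi(0x11223344, 0xAABBCCDD, 8, 24, 31)
```

Only the bits selected by the mask change in rA; every other bit of rA is carried through unchanged.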

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1) 

Simplified mnemonics: 

inslwi rA,rS,n,b            equivalent to    rlwimi rA,rS,32 - b,b,b + n - 1
insrwi rA,rS,n,b (n > 0)    equivalent to    rlwimi rA,rS,32 - (b + n),b,(b + n) - 1

rlwinmx

Rotate Left Word Immediate then AND with Mask (x’5400 0000’)

rlwinm     rA,rS,SH,MB,ME    (Rc = 0)
rlwinm.    rA,rS,SH,MB,ME    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-25    26-30    31
Field    21     S       A        SH       MB       ME       Rc

n ← SH
r ← ROTL(rS, n)
m ← MASK(MB, ME)
rA ← r & m


The contents of rS are rotated left the number of bits specified by operand SH. A mask is 
generated having 1 bits from bit MB through bit ME and 0 bits elsewhere. The rotated data 
is ANDed with the generated mask and the result is placed into rA. 

NOTE: rlwinm can be used to extract, rotate, shift, and clear bit fields using the methods
shown below:

• To extract an n-bit field that starts at bit position b in rS, right-justified into rA
(clearing the remaining 32 - n bits of rA), set SH = b + n, MB = 32 - n, and ME = 31.

• To extract an n-bit field that starts at bit position b in rS, left-justified into rA
(clearing the remaining 32 - n bits of rA), set SH = b, MB = 0, and ME = n - 1.

• To rotate the contents of a register left (or right) by n bits, set SH = n (32 - n),
MB = 0, and ME = 31.

• To shift the contents of a register right by n bits, set SH = 32 - n, MB = n, and
ME = 31. rlwinm can also clear the high-order b bits of a register and then shift the
result left by n bits: set SH = n, MB = b - n, and ME = 31 - n.

• To clear the low-order n bits of a register, set SH = 0, MB = 0, and ME = 31 - n.
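
The field-manipulation settings listed above can be exercised with a short Python model. It is a sketch under stated assumptions: the helper names are illustrative, the SH operand is reduced modulo 32 because the field is five bits wide, and bit 0 is the most-significant bit.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1 bits from bit mb through bit me (bit 0 = MSB), wrapping if mb > me."""
    from_mb = MASK32 >> mb
    up_to_me = (MASK32 << (31 - me)) & MASK32
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)

def rlwinm(rs, sh, mb, me):
    """rA <- ROTL(rS, SH) & MASK(MB, ME)."""
    return rotl32(rs, sh) & mask(mb, me)

def extract_right(rs, n, b):      # SH = b + n, MB = 32 - n, ME = 31
    return rlwinm(rs, b + n, 32 - n, 31)

def extract_left(rs, n, b):       # SH = b, MB = 0, ME = n - 1
    return rlwinm(rs, b, 0, n - 1)

def shift_left(rs, n):            # SH = n, MB = 0, ME = 31 - n
    return rlwinm(rs, n, 0, 31 - n)

def shift_right(rs, n):           # SH = 32 - n, MB = n, ME = 31
    return rlwinm(rs, 32 - n, n, 31)

def clear_low(rs, n):             # SH = 0, MB = 0, ME = 31 - n
    return rlwinm(rs, 0, 0, 31 - n)
```

For example, with rS = 0xAABBCCDD the 8-bit field starting at bit 8 is 0xBB, so extract_right yields 0x000000BB and extract_left yields 0xBB000000.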

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1)

Simplified mnemonics:

extlwi rA,rS,n,b (n > 0)            equivalent to    rlwinm rA,rS,b,0,n - 1
extrwi rA,rS,n,b (n > 0)            equivalent to    rlwinm rA,rS,b + n,32 - n,31
rotlwi rA,rS,n                      equivalent to    rlwinm rA,rS,n,0,31
rotrwi rA,rS,n                      equivalent to    rlwinm rA,rS,32 - n,0,31
slwi rA,rS,n (n < 32)               equivalent to    rlwinm rA,rS,n,0,31 - n
srwi rA,rS,n (n < 32)               equivalent to    rlwinm rA,rS,32 - n,n,31
clrlwi rA,rS,n (n < 32)             equivalent to    rlwinm rA,rS,0,n,31
clrrwi rA,rS,n (n < 32)             equivalent to    rlwinm rA,rS,0,0,31 - n
clrlslwi rA,rS,b,n (n ≤ b < 32)     equivalent to    rlwinm rA,rS,n,b - n,31 - n

rlwnmx

Rotate Left Word then AND with Mask (x’5C00 0000’)

rlwnm     rA,rS,rB,MB,ME    (Rc = 0)
rlwnm.    rA,rS,rB,MB,ME    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-25    26-30    31
Field    23     S       A        B        MB       ME       Rc

n ← rB[27-31]
r ← ROTL(rS, n)
m ← MASK(MB, ME)
rA ← r & m


The contents of rS are rotated left the number of bits specified by the low-order five bits of 
rB. A mask is generated having 1 bits from bit MB through bit ME and 0 bits elsewhere. 
The rotated data is ANDed with the generated mask and the result is placed into rA. 

NOTE: rlwnm can be used to extract and rotate bit fields using the methods shown as
follows:

• To extract an n-bit field that starts at variable bit position b in rS, right-justified into
rA (clearing the remaining 32 - n bits of rA), set the low-order five bits of
rB to b + n, MB = 32 - n, and ME = 31.

• To extract an n-bit field that starts at variable bit position b in rS, left-justified into
rA (clearing the remaining 32 - n bits of rA), set the low-order five bits of
rB to b, MB = 0, and ME = n - 1.

• To rotate the contents of a register left (or right) by n bits, set the low-order
five bits of rB to n (32 - n), MB = 0, and ME = 31.
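
A minimal Python sketch of the register-controlled rotate follows. The helper names are illustrative assumptions; the point it shows is that only the low-order five bits of rB take part, so a rotate right by n is expressed as a rotate left by 32 - n.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1 bits from bit mb through bit me (bit 0 = MSB), wrapping if mb > me."""
    from_mb = MASK32 >> mb
    up_to_me = (MASK32 << (31 - me)) & MASK32
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)

def rlwnm(rs, rb, mb, me):
    """n <- rB[27-31]; rA <- ROTL(rS, n) & MASK(MB, ME)."""
    n = rb & 0x1F
    return rotl32(rs, n) & mask(mb, me)

def rotlw(rs, rb):
    """Simplified mnemonic: rlwnm rA,rS,rB,0,31."""
    return rlwnm(rs, rb, 0, 31)
```

Because the shift count comes from a register, the same instruction can extract a field whose position is known only at run time.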

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1) 

Simplified mnemonics: 

rotlw rA,rS,rB equivalent to rlwnm rA,rS,rB,0,31

sc

System Call (x’4400 0002’)

sc

Bits     0-5    6-10     11-15    16-29                30    31
Field    17     00000    00000    0000 0000 0000 00    1     0

Reserved: bits 6-29 and 31


In the PowerPC UISA, the sc instruction calls the operating system to perform a service. 
When control is returned to the program that executed the system call, the content of the 
registers depends on the register conventions used by the program providing the system 
service. 


This instruction is context synchronizing, as described in Section 4.1.5.1, “Context
Synchronizing Instructions.”

Other registers altered:

• Dependent on the system service

In PowerPC OEA, the sc instruction does the following:

SRR0 ←iea CIA + 4
SRR1[1-4, 10-15] ← 0
SRR1[0, 5-9, 16-23, 25-27, 30-31] ← MSR[0, 5-9, 16-23, 25-27, 30-31]
MSR ← new_value (see below)
NIA ←iea base_ea + 0xC00 (see below)

The EA of the instruction following the sc instruction is placed into SRR0. Bits 0, 5-9,
16-23, 25-27, and 30-31 of the MSR are placed into the corresponding bits of SRR1, and bits
1-4 and 10-15 of SRR1 are set to undefined values.

NOTE: An implementation may define additional MSR bits, and in this case, may also
cause them to be saved to SRR1 from MSR on an exception and restored to MSR
from SRR1 on an rfi.

Then a system call exception is generated. The exception causes the MSR to be altered as
described in Section 6.4, “Exception Definitions.”

The exception causes the next instruction to be fetched from offset 0xC00 from the physical
base address determined by the new setting of MSR[IP].

Other registers altered: 

• SRR0 

• SRR1 

• MSR

slwx

Shift Left Word (x’7C00 0030’)

slw     rA,rS,rB    (Rc = 0)
slw.    rA,rS,rB    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        24       Rc

n ← rB[27-31]
r ← ROTL(rS, n)
if rB[26] = 0
    then m ← MASK(0, 31 - n)
    else m ← (32)0
rA ← r & m


The contents of rS are shifted left the number of bits specified by the low-order five bits of 
rB. Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the 
right. The 32-bit result is placed into rA. However, shift amounts from 32 to 63 give a zero 
result. 
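
The role of rB[26] in forcing a zero result can be seen in a direct Python transcription of the operation. This is a sketch with illustrative helper names; rB[26] is the bit of weight 32 when bit 31 is the least-significant bit.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1 bits from bit mb through bit me (bit 0 = MSB); assumes mb <= me."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32

def slw(rs, rb):
    n = rb & 0x1F                  # rB[27-31]
    r = rotl32(rs, n)
    if (rb >> 5) & 1 == 0:         # rB[26] = 0: shift amount is 0-31
        m = mask(0, 31 - n)
    else:                          # rB[26] = 1: shift amount is 32-63
        m = 0
    return r & m
```

Shift amounts of 32 through 63 therefore clear the result instead of wrapping around to a smaller shift.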


Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1)

srawx

Shift Right Algebraic Word (x’7C00 0630’)

sraw     rA,rS,rB    (Rc = 0)
sraw.    rA,rS,rB    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        792      Rc

n ← rB[27-31]
r ← ROTL(rS, 32 - n)
if rB[26] = 0
    then m ← MASK(n, 31)
    else m ← (32)0
S ← rS(0)
rA ← r & m | (32)S & ¬ m
XER[CA] ← S & ((r & ¬ m) ≠ 0)

The contents of rS are shifted right the number of bits specified by the low-order five bits 
of rB (shift amounts between 0-31). Bits shifted out of position 31 are lost. Bit 0 of rS is 
replicated to fill the vacated positions on the left. The 32-bit result is placed into rA. 
XER[CA] is set if rS contains a negative number and any 1 bits are shifted out of position 
31; otherwise XER[CA] is cleared. A shift amount of zero causes rA to receive the 32 bits 
of rS, and XER[CA] to be cleared. However, shift amounts from 32 to 63 give a result of 
32 sign bits, and cause XER[CA] to receive the sign bit of rS. 

NOTE: The sraw instruction, followed by addze, can be used to divide quickly by 2^n.

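
The sraw and addze pairing can be checked with a Python model of the operation above. This is a sketch: the helper names are illustrative, and addze is modeled simply as adding XER[CA] to the shifted value. The carry corrects the result for negative operands so that the quotient rounds toward zero.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def sraw(rs, rb):
    """Return (rA, XER[CA]) for a shift right algebraic word."""
    n = rb & 0x1F                                   # rB[27-31]
    r = rotl32(rs, 32 - n)
    m = (MASK32 >> n) if (rb >> 5) & 1 == 0 else 0  # MASK(n, 31) or (32)0
    s = (rs >> 31) & 1                              # sign bit rS(0)
    fill = MASK32 if s else 0
    ra = (r & m) | (fill & ~m & MASK32)
    ca = s if (r & ~m & MASK32) != 0 else 0
    return ra, ca

def to_signed(x):
    return x - (1 << 32) if x & 0x80000000 else x

def div_pow2(x, n):
    """Signed divide by 2**n using sraw followed by addze (adds CA)."""
    ra, ca = sraw(x & MASK32, n)
    return to_signed((ra + ca) & MASK32)
```

For instance, shifting -7 right by 2 gives -2 with CA set, and adding the carry yields -1, the truncated quotient of -7 / 4.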
Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1) 

• XER: 

Affected: CA

PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

srawix

Shift Right Algebraic Word Immediate (x’7C00 0670’)

srawi     rA,rS,SH    (Rc = 0)
srawi.    rA,rS,SH    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        SH       824      Rc

n ← SH
r ← ROTL(rS, 32 - n)
m ← MASK(n, 31)
S ← rS(0)
rA ← r & m | (32)S & ¬ m
XER[CA] ← S & ((r & ¬ m) ≠ 0)

The contents of rS are shifted right SH bits. Bits shifted out of position 31 are lost. Bit 0 of
rS is replicated to fill the vacated positions on the left. The result is placed into rA.
XER[CA] is set if the 32 bits of rS contain a negative number and any 1 bits are shifted out
of position 31; otherwise XER[CA] is cleared. A shift amount of zero causes rA to receive
the value of rS, and XER[CA] to be cleared.

NOTE: The srawi instruction, followed by addze, can be used to divide quickly by 2^n.

Other registers altered:

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)

• XER: 

Affected: CA

srwx

Shift Right Word (x’7C00 0430’)

srw     rA,rS,rB    (Rc = 0)
srw.    rA,rS,rB    (Rc = 1)

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        536      Rc

n ← rB[27-31]
r ← ROTL(rS, 32 - n)
if rB[26] = 0
    then m ← MASK(n, 31)
    else m ← (32)0
rA ← r & m


The contents of rS are shifted right the number of bits specified by the low-order five bits 
of rB (shift amounts between 0-31). Bits shifted out of position 31 are lost. Zeros are 
supplied to the vacated positions on the left. The 32-bit result is placed into rA. However, 
shift amounts from 32 to 63 give a zero result. 
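
The logical right shift is built from a left rotate by 32 - n followed by a mask that clears the vacated high-order bits. The Python sketch below transcribes that operation; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def srw(rs, rb):
    n = rb & 0x1F                                   # rB[27-31]
    r = rotl32(rs, 32 - n)
    m = (MASK32 >> n) if (rb >> 5) & 1 == 0 else 0  # MASK(n, 31) or (32)0
    return r & m
```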

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1)

stb

Store Byte (x’9800 0000’)

stb     rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    38     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
MEM(EA, 1) ← rS[24-31]

EA is the sum (rA|0) + d. The contents of the low-order eight bits of rS are stored into the
byte in memory addressed by EA.

Other registers altered: 

• None

stbu

Store Byte with Update (x’9C00 0000’)

stbu    rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    39     S       A        d

EA ← (rA) + EXTS(d)
MEM(EA, 1) ← rS[24-31]
rA ← EA


EA is the sum (rA) + d. The contents of the low-order eight bits of rS are stored into the 
byte in memory addressed by EA. 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None

stbux

Store Byte with Update Indexed (x’7C00 01EE’)

stbux    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        247      0

Reserved: bit 31

EA ← (rA) + (rB)
MEM(EA, 1) ← rS[24-31]
rA ← EA

EA is the sum (rA) + (rB). The contents of the low-order eight bits of rS are stored into the
byte in memory addressed by EA.

EA is placed into rA.

If rA = 0, the instruction form is invalid.

Other registers altered: 

• None

stbx

Store Byte Indexed (x’7C00 01AE’)

stbx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        215      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 1) ← rS[24-31]

EA is the sum (rA|0) + (rB). The contents of the low-order eight bits of rS are stored into
the byte in memory addressed by EA.

Other registers altered: 

• None

stfd

Store Floating-Point Double (x’D800 0000’)

stfd    frS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    54     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
MEM(EA, 8) ← (frS)

EA is the sum (rA|0) + d.

The contents of register frS are stored into the double word in memory addressed by EA. 

Other registers altered: 

• None

stfdu

Store Floating-Point Double with Update (x’DC00 0000’)

stfdu    frS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    55     S       A        d

EA ← (rA) + EXTS(d)
MEM(EA, 8) ← (frS)
rA ← EA

EA is the sum (rA) + d.

The contents of register frS are stored into the double word in memory addressed by EA.
EA is placed into rA.

If rA = 0, the instruction form is invalid.

Other registers altered: 

• None

stfdux

Store Floating-Point Double with Update Indexed (x’7C00 05EE’)

stfdux    frS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        759      0

Reserved: bit 31

EA ← (rA) + (rB)
MEM(EA, 8) ← (frS)
rA ← EA

EA is the sum (rA) + (rB).

The contents of register frS are stored into the double word in memory addressed by EA.
EA is placed into rA.

If rA = 0, the instruction form is invalid.

Other registers altered: 

• None

stfdx

Store Floating-Point Double Indexed (x’7C00 05AE’)

stfdx    frS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        727      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 8) ← (frS)

EA is the sum (rA|0) + (rB).

The contents of register frS are stored into the double word in memory addressed by EA. 

Other registers altered: 

• None

stfiwx

Store Floating-Point as Integer Word Indexed (x’7C00 07AE’)

stfiwx    frS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        983      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 4) ← frS[32-63]

EA is the sum (rA|0) + (rB).

The contents of the low-order 32 bits of register frS are stored, without conversion, into the 
word in memory addressed by EA. 

This instruction, when preceded by the floating-point convert to integer word (fctiwx) or
floating-point convert to integer word with round toward zero (fctiwzx) instruction, stores
the 32-bit integer value of a double-precision floating-point number (see the fctiwx and
fctiwzx instructions).

If the content of register frS is a double-precision floating-point number, the low-order 32
bits of the 52-bit mantissa are stored (without the exponent, this could be a meaningless
value).

If the contents of register frS were produced, either directly or indirectly, by an lfs
instruction, a single-precision arithmetic instruction, or frsp, then the value stored is the
low-order 32 bits of the 52-bit mantissa of the double-precision number (all single-precision
floating-point numbers are maintained in double-precision format in the floating-point
register file).
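
The "without conversion" behavior can be illustrated in Python by looking at the raw 64-bit image of a double. This is a sketch: fpr_bits and stfiwx_word are illustrative names, and the last line uses a hypothetical register image whose low word holds an integer, standing in for the result of a prior convert-to-integer instruction.

```python
import struct

def fpr_bits(x):
    """64-bit image of a double as it would sit in a floating-point register."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def stfiwx_word(frs_bits):
    """Word stored by stfiwx: frS[32-63], the low-order 32 bits, unconverted."""
    return frs_bits & 0xFFFFFFFF

word_from_one = stfiwx_word(fpr_bits(1.0))      # low mantissa bits of 1.0
word_from_tenth = stfiwx_word(fpr_bits(0.1))    # low mantissa bits of 0.1

# Hypothetical register image with the integer 42 in the low-order word.
word_from_int = stfiwx_word(0x000000000000002A)
```

Storing 1.0 directly produces the word 0, not 1, which is why a conversion instruction normally precedes stfiwx.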

Other registers altered: 

• None

stfs

Store Floating-Point Single (x’D000 0000’)

stfs    frS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    52     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
MEM(EA, 4) ← SINGLE(frS)

EA is the sum (rA|0) + d.

The contents of register frS are converted to single-precision and stored into the word in
memory addressed by EA. For a discussion on floating-point store conversions, see
Section D.7, “Floating-Point Store Instructions.”

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

stfsu

Store Floating-Point Single with Update (x’D400 0000’)

stfsu    frS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    53     S       A        d

EA ← (rA) + EXTS(d)
MEM(EA, 4) ← SINGLE(frS)
rA ← EA

EA is the sum (rA) + d. 

The contents of frS are converted to single-precision and stored into the word in memory 
addressed by EA. For a discussion on floating-point store conversions, see Section D.7, 
“Floating-Point Store Instructions.” 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

stfsux

Store Floating-Point Single with Update Indexed (x’7C00 056E’)

stfsux    frS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        695      0

Reserved: bit 31

EA ← (rA) + (rB)
MEM(EA, 4) ← SINGLE(frS)
rA ← EA

EA is the sum (rA) + (rB). 

The contents of frS are converted to single-precision and stored into the word in memory 
addressed by EA. For a discussion on floating-point store conversions, see Section D.7, 
“Floating-Point Store Instructions.” 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stfsx

Store Floating-Point Single Indexed (x’7C00 052E’)

stfsx    frS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        663      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 4) ← SINGLE(frS)

EA is the sum (rA|0) + (rB).

The contents of register frS are converted to single-precision and stored into the word in 
memory addressed by EA. For a discussion on floating-point store conversions, see 
Section D.7, “Floating-Point Store Instructions.” 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

sth

Store Half Word (x’B000 0000’)

sth     rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    44     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
MEM(EA, 2) ← rS[16-31]

EA is the sum (rA|0) + d. The contents of the low-order 16 bits of rS are stored into the half
word in memory addressed by EA.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

sthbrx

Store Half Word Byte-Reverse Indexed (x’7C00 072C’)

sthbrx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        918      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 2) ← rS[24-31] || rS[16-23]

EA is the sum (rA|0) + (rB). The contents of the low-order eight bits (24-31) of rS are
stored into bits 0-7 of the half word in memory addressed by EA. The contents of the
subsequent low-order eight bits (16-23) of rS are stored into bits 8-15 of the half word in
memory addressed by EA.
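
The byte ordering produced by the byte-reversed store can be compared with an ordinary half word store in a few lines of Python. This is an illustrative sketch; the function names are assumptions, and each function returns the two bytes in increasing address order starting at EA.

```python
def sth_bytes(rs):
    """Bytes written by sth: rS[16-23] at EA, rS[24-31] at EA + 1."""
    return bytes([(rs >> 8) & 0xFF, rs & 0xFF])

def sthbrx_bytes(rs):
    """Bytes written by sthbrx: rS[24-31] at EA, rS[16-23] at EA + 1."""
    return bytes([rs & 0xFF, (rs >> 8) & 0xFF])

normal = sth_bytes(0x1234ABCD)
reversed_order = sthbrx_bytes(0x1234ABCD)
```

Read back as a little-endian half word, the bytes written by sthbrx give the original low-order 16 bits of rS.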

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

sthu

Store Half Word with Update (x’B400 0000’)

sthu    rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    45     S       A        d

EA ← (rA) + EXTS(d)
MEM(EA, 2) ← rS[16-31]
rA ← EA

EA is the sum (rA) + d. The contents of the low-order 16 bits of rS are stored into the half 
word in memory addressed by EA. 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

sthux

Store Half Word with Update Indexed (x’7C00 036E’)

sthux    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        439      0

Reserved: bit 31

EA ← (rA) + (rB)
MEM(EA, 2) ← rS[16-31]
rA ← EA

EA is the sum (rA) + (rB). The contents of the low-order 16 bits of rS are stored into the 
half word in memory addressed by EA. 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

sthx

Store Half Word Indexed (x’7C00 032E’)

sthx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        407      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 2) ← rS[16-31]

EA is the sum (rA|0) + (rB). The contents of the low-order 16 bits of rS are stored into the
half word in memory addressed by EA.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stmw

Store Multiple Word (x’BC00 0000’)

stmw    rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    47     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
r ← rS
do while r ≤ 31
    MEM(EA, 4) ← GPR(r)
    r ← r + 1
    EA ← EA + 4

EA is the sum (rA|0) + d.
n = (32 - rS).

n consecutive words starting at EA are stored from the GPRs rS through r31. For example, 
if rS = 30, 2 words are stored. 
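
The store loop can be modeled in Python with a register list and a byte array standing in for memory. This is a sketch under stated assumptions: gpr, mem, and the function names are illustrative, memory is big-endian, and alignment checking is not modeled.

```python
MASK32 = 0xFFFFFFFF

def exts16(d):
    """Sign-extend a 16-bit displacement."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stmw(gpr, rs, ra, d, mem):
    """Store GPRs rS through r31 as consecutive words; returns the word count."""
    b = 0 if ra == 0 else gpr[ra]
    ea = (b + exts16(d)) & MASK32
    r = rs
    while r <= 31:
        mem[ea:ea + 4] = gpr[r].to_bytes(4, "big")
        r += 1
        ea += 4
    return 32 - rs

gpr = [0] * 32
gpr[1] = 16                       # base address held in r1
gpr[30] = 0xAAAAAAAA
gpr[31] = 0xBBBBBBBB
mem = bytearray(64)
words_stored = stmw(gpr, 30, 1, 8, mem)   # EA = 16 + 8 = 24
```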

EA must be a multiple of four. If it is not, either the system alignment exception handler is 
invoked or the results are boundedly undefined. For additional information about alignment 
and DSI exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

NOTE: In some implementations, this instruction is likely to have a greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual store 
instructions that produce the same results. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

stswi

Store String Word Immediate (x’7C00 05AA’)

stswi    rS,rA,NB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        NB       725      0

Reserved: bit 31

if rA = 0
    then EA ← 0
    else EA ← (rA)
if NB = 0
    then n ← 32
    else n ← NB
r ← rS - 1
i ← 0
do while n > 0
    if i = 0
        then r ← r + 1 (mod 32)
    MEM(EA, 1) ← GPR(r)[i, i+7]
    i ← i + 8
    if i = 32
        then i ← 0
    EA ← EA + 1
    n ← n - 1

EA is (rA|0). Let n = NB if NB ≠ 0, n = 32 if NB = 0; n is the number of bytes to store. Let
nr = CEIL(n / 4); nr is the number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs rS through rS + nr - 1. Bytes are
stored left to right from each register. The sequence of registers wraps around through r0 if
required.
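
The byte-by-byte loop, including the wraparound from r31 to r0, can be sketched in Python as follows. The register list gpr, the byte array mem, and the function name are illustrative assumptions; exception behavior is not modeled.

```python
def stswi(gpr, rs, ra, nb, mem):
    """Store n bytes, left to right, from GPRs starting at rS."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    r = rs - 1
    i = 0
    while n > 0:
        if i == 0:
            r = (r + 1) % 32              # next register, wrapping to r0
        mem[ea] = (gpr[r] >> (24 - i)) & 0xFF   # bits i..i+7, bit 0 = MSB
        i += 8
        if i == 32:
            i = 0
        ea += 1
        n -= 1

gpr = [0] * 32
gpr[5] = 4                # EA comes from r5
gpr[31] = 0x41424344      # "ABCD"
gpr[0] = 0x45464748       # "EFGH", reached by wrapping past r31
mem = bytearray(16)
stswi(gpr, 31, 5, 6, mem)   # six bytes need CEIL(6 / 4) = 2 registers
```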

Under certain conditions (for example, segment boundary crossing) the data alignment 
exception handler may be invoked. For additional information about data alignment 
exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

NOTE: In some implementations, this instruction is likely to have a greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual store 
instructions that produce the same results. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stswx

Store String Word Indexed (x’7C00 052A’)

stswx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        661      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
n ← XER[25-31]
r ← rS - 1
i ← 0
do while n > 0
    if i = 0
        then r ← r + 1 (mod 32)
    MEM(EA, 1) ← GPR(r)[i, i+7]
    i ← i + 8
    if i = 32
        then i ← 0
    EA ← EA + 1
    n ← n - 1

EA is the sum (rA|0) + (rB). Let n = XER[25-31]; n is the number of bytes to store. Let
nr = CEIL(n / 4); nr is the number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs rS through rS + nr - 1. Bytes are
stored left to right from each register. The sequence of registers wraps around through r0 if
required. If n = 0, no bytes are stored.

Under certain conditions (for example, segment boundary crossing) the data alignment 
exception handler may be invoked. For additional information about data alignment 
exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

NOTE: In some implementations, this instruction is likely to have a greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual store 
instructions that produce the same results. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stw

Store Word (x’9000 0000’)

stw     rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    36     S       A        d

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + EXTS(d)
MEM(EA, 4) ← rS

EA is the sum (rA|0) + d. The contents of rS are stored into the word in memory addressed
by EA.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

stwbrx

Store Word Byte-Reverse Indexed (x’7C00 052C’)

stwbrx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        662      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 4) ← rS[24-31] || rS[16-23] || rS[8-15] || rS[0-7]

EA is the sum (rA|0) + (rB). The contents of the low-order eight bits (24-31) of rS are
stored into bits 0-7 of the word in memory addressed by EA. The contents of the
subsequent eight low-order bits (16-23) of rS are stored into bits 8-15 of the word in
memory addressed by EA. The contents of the subsequent eight low-order bits (8-15) of rS
are stored into bits 16-23 of the word in memory addressed by EA. The contents of the
subsequent eight low-order bits (0-7) of rS are stored into bits 24-31 of the word in
memory addressed by EA.
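
In byte terms, the store places the least-significant byte of rS at the lowest address. A short Python sketch makes the ordering concrete; the function names are illustrative, and each returns the four bytes in increasing address order starting at EA.

```python
MASK32 = 0xFFFFFFFF

def stw_bytes(rs):
    """Bytes written by stw: rS[0-7] first."""
    return (rs & MASK32).to_bytes(4, "big")

def stwbrx_bytes(rs):
    """Bytes written by stwbrx: rS[24-31] || rS[16-23] || rS[8-15] || rS[0-7]."""
    return (rs & MASK32).to_bytes(4, "little")

normal = stw_bytes(0x11223344)
reversed_order = stwbrx_bytes(0x11223344)
```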

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stwcx.

Store Word Conditional Indexed (x’7C00 012D’)

stwcx.    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        150      1

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
if RESERVE
    then
        MEM(EA, 4) ← (rS)
        CR0 ← 0b00 || 0b1 || XER[SO]
        RESERVE ← 0
    else
        CR0 ← 0b00 || 0b0 || XER[SO]

EA is the sum (rA|0) + (rB). If the reserved bit is set, the stwcx. instruction stores rS to
effective address (rA + rB), clears the reserved bit, and sets CR0[EQ]. If the reserved bit
is not set, the stwcx. instruction does not do a store; it leaves the reserved bit cleared and
clears CR0[EQ]. Software must look at CR0[EQ] to see if the stwcx. was successful.

The reserved bit is set by the lwarx instruction. The reserved bit is cleared by any stwcx. 
instruction to any address, and also by snooping logic if it detects that another processor 
does any kind of write or invalidate to the block indicated in the reservation buffer when 
reserved is set. 

EA must be a multiple of four. If it is not, either the system alignment exception handler is 
invoked or the results are boundedly undefined. For additional information about alignment 
and DSI exceptions, see Section 6.4.3, “DSI Exception (0x00300).” 

The granularity with which reservations are managed is implementation-dependent. 
Therefore, the memory to be accessed by the load and reserve and store conditional 
instructions should be controlled by a system library program. 
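
The reservation protocol described above can be illustrated with a small Python model. It is a sketch only: the Cpu class, snoop_write, and atomic_increment are hypothetical names, the model tracks just the reserved bit (not the reserved address or granule), and a callback stands in for a write by another processor.

```python
MASK32 = 0xFFFFFFFF

class Cpu:
    """Toy model of the reserved bit, memory words, and CR0[EQ]."""
    def __init__(self):
        self.mem = {}
        self.reserve = False
        self.cr0_eq = False

    def lwarx(self, ea):
        self.reserve = True               # load and set the reserved bit
        return self.mem.get(ea, 0)

    def stwcx(self, ea, value):
        if self.reserve:
            self.mem[ea] = value & MASK32
            self.cr0_eq = True            # store performed
            self.reserve = False
        else:
            self.cr0_eq = False           # no store performed
        return self.cr0_eq

    def snoop_write(self):
        self.reserve = False              # another processor wrote the block

def atomic_increment(cpu, ea, interfere=None):
    """Retry the lwarx/stwcx. pair until CR0[EQ] reports success."""
    attempts = 0
    while True:
        attempts += 1
        value = cpu.lwarx(ea)
        if interfere is not None and attempts == 1:
            interfere(cpu)                # lose the reservation once
        if cpu.stwcx(ea, value + 1):
            return attempts

cpu = Cpu()
cpu.mem[0x100] = 41
first = atomic_increment(cpu, 0x100)
second = atomic_increment(cpu, 0x100, interfere=lambda c: c.snoop_write())
```

The retry loop is the usual software pattern: a failed stwcx. leaves memory untouched, so the sequence simply reloads and tries again.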

Because the hardware doesn’t compare reservation addresses when executing the stwcx.
instruction, operating system software MUST reset the reservation if an exception or other
type of interrupt occurs to ensure atomic memory references of lwarx and stwcx. pairs.

Other registers altered:

• CR0 field is set to reflect whether the store operation was performed, as follows:
CR0[LT GT EQ SO] = 0b00 || store_performed || XER[SO]

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stwu

Store Word with Update (x’9400 0000’)

stwu    rS,d(rA)

Bits     0-5    6-10    11-15    16-31
Field    37     S       A        d

EA ← (rA) + EXTS(d)
MEM(EA, 4) ← (rS)
rA ← EA


EA is the sum (rA) + d. The contents of rS are stored into the word in memory addressed 
by EA. 

EA is placed into rA. 

If rA = 0, the instruction form is invalid. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

stwux

Store Word with Update Indexed (x’7C00 016E’)

stwux    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        183      0

Reserved: bit 31

EA ← (rA) + (rB)
MEM(EA, 4) ← (rS)
rA ← EA

EA is the sum (rA) + (rB). The contents of rS are stored into the word in memory addressed
by EA.

EA is placed into rA.

If rA = 0, the instruction form is invalid.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

stwx

Store Word Indexed (x’7C00 012E’)

stwx    rS,rA,rB

Bits     0-5    6-10    11-15    16-20    21-30    31
Field    31     S       A        B        151      0

Reserved: bit 31

if rA = 0
    then b ← 0
    else b ← (rA)
EA ← b + (rB)
MEM(EA, 4) ← (rS)

EA is the sum (rA|0) + (rB). The contents of rS are stored into the word in memory
addressed by EA.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

subfx

Subtract From (x'7C00 0050')

subf     rD,rA,rB    (OE = 0 Rc = 0)
subf.    rD,rA,rB    (OE = 0 Rc = 1)
subfo    rD,rA,rB    (OE = 1 Rc = 0)
subfo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:   31     D       A        B        OE    40       Rc
Bits:    0-5    6-10    11-15    16-20    21    22-30    31

rD ← ¬(rA) + (rB) + 1

The sum ¬(rA) + (rB) + 1 is placed into rD (equivalent to (rB) − (rA)).

The subf instruction is preferred for subtraction because it sets few status bits.
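The equivalence between ¬(rA) + (rB) + 1 and (rB) − (rA) is the usual two's-complement identity. The following Python sketch checks it on 32-bit values; the function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def subf(a, b):
    """rD <- ~(rA) + (rB) + 1, truncated to 32 bits."""
    return ((~a & MASK32) + b + 1) & MASK32

def sub_direct(a, b):
    """(rB) - (rA) modulo 2**32, for comparison."""
    return (b - a) & MASK32

samples = [(5, 12), (12, 5), (0, 0), (0xFFFFFFFF, 1), (0x80000000, 0x7FFFFFFF)]
all_equal = all(subf(a, b) == sub_direct(a, b) for a, b in samples)
```

Note the operand order: rA is subtracted from rB, which is why the simplified mnemonic sub swaps its last two operands.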

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1) 

• XER: 

Affected: SO, OV (if OE = 1) 

Simplified mnemonics: 

sub rD,rA,rB equivalent to subf rD,rB,rA 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  XO

subfcx

Subtract from Carrying (x'7C00 0010')

subfc     rD,rA,rB    (OE = 0 Rc = 0)
subfc.    rD,rA,rB    (OE = 0 Rc = 1)
subfco    rD,rA,rB    (OE = 1 Rc = 0)
subfco.   rD,rA,rB    (OE = 1 Rc = 1)

Field:   31     D       A        B        OE    8        Rc
Bits:    0-5    6-10    11-15    16-20    21    22-30    31

rD ← ¬(rA) + (rB) + 1

The sum ¬(rA) + (rB) + 1 is placed into rD (equivalent to (rB) − (rA)).

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1)

Note: CR0 field may not reflect the infinitely precise result if overflow occurs (see 
XER below). 

• XER: 

Affected: CA 

Affected: SO, OV (if OE = 1) 

Note: The setting of the affected bits in the XER reflects overflow of the 32-bit 
results. For further information see Chapter 3, “Operand Conventions.” 

Simplified mnemonics: 

subc rD,rA,rB equivalent to subfc rD,rB,rA 
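The effect of subfc on XER[CA] can be sketched as follows, assuming CA receives the carry out of the most-significant bit of the 32-bit sum (the convention Chapter 3 is referenced for). The function name and the tuple return value are choices made for this example.

```python
MASK32 = 0xFFFFFFFF

def subfc(a, b):
    """Return (rD, CA) for rD <- ~(rA) + (rB) + 1 on 32-bit operands."""
    total = (~a & MASK32) + b + 1
    return total & MASK32, (total >> 32) & 1

no_borrow = subfc(5, 12)   # 12 - 5: result 7, CA = 1
borrow = subfc(12, 5)      # 5 - 12: result wraps, CA = 0
```

Under this model CA = 1 means that no borrow occurred, that is, (rB) is greater than or equal to (rA) when both are treated as unsigned values.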


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  XO

subfex

Subtract from Extended (x'7C00 0110')

subfe     rD,rA,rB    (OE = 0 Rc = 0)
subfe.    rD,rA,rB    (OE = 0 Rc = 1)
subfeo    rD,rA,rB    (OE = 1 Rc = 0)
subfeo.   rD,rA,rB    (OE = 1 Rc = 1)

Field:   31     D       A        B        OE    136      Rc
Bits:    0-5    6-10    11-15    16-20    21    22-30    31

rD ← ¬(rA) + (rB) + XER[CA]

The sum ¬(rA) + (rB) + XER[CA] is placed into rD.
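Because subfe adds XER[CA] in place of the constant 1, it can continue a subtraction started by subfc on the low-order words. The sketch below models a 64-bit subtraction built from the two operations; the helper names and the (high, low) pair representation are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def subfc(a, b):
    """Low-order step: ~a + b + 1, returning (result, CA)."""
    total = (~a & MASK32) + b + 1
    return total & MASK32, (total >> 32) & 1

def subfe(a, b, ca):
    """Extended step: ~a + b + CA, returning (result, CA)."""
    total = (~a & MASK32) + b + ca
    return total & MASK32, (total >> 32) & 1

def sub64(a_hi, a_lo, b_hi, b_lo):
    """Compute b - a for 64-bit values held as (hi, lo) 32-bit halves."""
    lo, ca = subfc(a_lo, b_lo)
    hi, _ = subfe(a_hi, b_hi, ca)
    return hi, lo

result = sub64(0, 1, 1, 0)  # 0x1_00000000 - 1
```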

Other registers altered: 

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)

Note: CR0 field may not reflect the infinitely precise result if overflow occurs (see
Chapter 3, “Operand Conventions,” for the setting of affected bits).

• XER: 

Affected: CA 

Affected: SO, OV (if OE = 1) 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  XO

subfic

Subtract from Immediate Carrying (x'2000 0000')

subfic rD,rA,SIMM

Field:   08     D       A        SIMM
Bits:    0-5    6-10    11-15    16-31

rD ← ¬(rA) + EXTS(SIMM) + 1

The sum ¬(rA) + EXTS(SIMM) + 1 is placed into rD (equivalent to EXTS(SIMM) − (rA)).
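A short Python model of the operation, including the sign extension of the immediate, is given below. The names `exts16` and `subfic` are illustrative, and CA is assumed to be the carry out of the 32-bit sum.

```python
MASK32 = 0xFFFFFFFF

def exts16(simm):
    """Sign-extend a 16-bit immediate field."""
    simm &= 0xFFFF
    return simm - 0x10000 if simm & 0x8000 else simm

def subfic(a, simm):
    """Return (rD, CA) for rD <- ~(rA) + EXTS(SIMM) + 1."""
    total = (~a & MASK32) + (exts16(simm) & MASK32) + 1
    return total & MASK32, (total >> 32) & 1

ten_minus_three = subfic(3, 10)
negate = subfic(3, 0)  # 0 - (rA) negates the register contents
```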

Other registers altered: 

• XER: 

Affected: CA 

Note: See Chapter 3, “Operand Conventions.” 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

subfmex

Subtract from Minus One Extended (x'7C00 01D0')

subfme     rD,rA    (OE = 0 Rc = 0)
subfme.    rD,rA    (OE = 0 Rc = 1)
subfmeo    rD,rA    (OE = 1 Rc = 0)
subfmeo.   rD,rA    (OE = 1 Rc = 1)

Field:   31     D       A        00000    OE    232      Rc
Bits:    0-5    6-10    11-15    16-20    21    22-30    31

Fields shown as zeros are reserved.

rD ← ¬(rA) + XER[CA] − 1

The sum ¬(rA) + XER[CA] + (32)1 is placed into rD.

Other registers altered: 

• Condition Register (CR0 field):

Affected: LT, GT, EQ, SO (if Rc = 1)

Note: CR0 field may not reflect the infinitely precise result if overflow occurs (see
Chapter 3, “Operand Conventions”).

• XER: 

Affected: CA 

Affected: SO, OV (if OE = 1) 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  XO

subfzex

Subtract from Zero Extended (x'7C00 0190')

subfze     rD,rA    (OE = 0 Rc = 0)
subfze.    rD,rA    (OE = 0 Rc = 1)
subfzeo    rD,rA    (OE = 1 Rc = 0)
subfzeo.   rD,rA    (OE = 1 Rc = 1)

Field:   31     D       A        00000    OE    200      Rc
Bits:    0-5    6-10    11-15    16-20    21    22-30    31

Fields shown as zeros are reserved.

rD ← ¬(rA) + XER[CA]

The sum ¬(rA) + XER[CA] is placed into rD.
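The subfze operation, together with the related subfme operation (which adds 32 one bits, that is, −1, to the same sum), can be sketched in Python as follows. The function names and the (result, CA) return convention are assumptions for the example, and CA is taken to be the carry out of the 32-bit sum.

```python
MASK32 = 0xFFFFFFFF

def subfze(a, ca):
    """Return (rD, CA) for rD <- ~(rA) + XER[CA]."""
    total = (~a & MASK32) + ca
    return total & MASK32, (total >> 32) & 1

def subfme(a, ca):
    """Return (rD, CA) for rD <- ~(rA) + XER[CA] + (32)1."""
    total = (~a & MASK32) + ca + MASK32  # (32)1 is 32 one bits, i.e. -1
    return total & MASK32, (total >> 32) & 1

zero_case = subfze(0, 1)      # 0 - 0 with no incoming borrow
minus_one_case = subfme(5, 0)  # -1 - 5 with an incoming borrow
```

Both forms are intended for the high-order words of a multiword subtraction in which one operand is 0 or −1.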

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1)

Note: CR0 field may not reflect the infinitely precise result if overflow occurs (see 
XER below). 

• XER: 

Affected: CA 

Affected: SO, OV (if OE = 1) 

Note: See Chapter 3, “Operand Conventions.” 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  XO












sync

Synchronize (x'7C00 04AC')

Field:   31     00000    00000    00000    598      0
Bits:    0-5    6-10     11-15    16-20    21-30    31

Fields shown as zeros are reserved.



The sync instruction provides an ordering function for the effects of all instructions 
executed by a given processor. Executing a sync instruction ensures that all instructions 
preceding the sync instruction appear to have completed before the sync instruction 
completes, and that no subsequent instructions are initiated by the processor until after the 
sync instruction completes. When the sync instruction completes, all external accesses 
caused by instructions preceding the sync instruction will have been performed with 
respect to all other mechanisms that access memory. For more information on how the sync 
instruction affects the VEA, refer to Chapter 5, “Cache Model and Memory Coherency.” 

Multiprocessor implementations also send a sync address-only broadcast that is useful in 
some designs. For example, if a design has an external buffer that re-orders loads and stores 
for better bus efficiency, the sync broadcast signals to that buffer that previous loads/stores 
must be completed before any following loads/stores. 

The sync instruction can be used to ensure that the results of all stores into a data structure, 
caused by store instructions executed in a “critical section” of a program, are seen by other 
processors before the data structure is seen as unlocked. 

The functions performed by the sync instruction will normally take a significant amount of 
time to complete, so indiscriminate use of this instruction may adversely affect 
performance. In addition, the time required to execute sync may vary from one execution 
to another. 

The eieio instruction may be more appropriate than sync in many cases.

This instruction is execution synchronizing. For more information on execution 
synchronization, see Section 4.1.5, “Synchronizing Instructions.” 


Other registers altered: 
• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

tlbia

Translation Lookaside Buffer Invalidate All (x'7C00 02E4')

tlbia

Field:   31     00000    00000    00000    370      0
Bits:    0-5    6-10     11-15    16-20    21-30    31

Fields shown as zeros are reserved.

All TLB entries ← invalid

The entire translation lookaside buffer (TLB) is invalidated (that is, all entries are
removed).

The TLB is invalidated regardless of the settings of MSR[IR] and MSR[DR]. The
invalidation is done without reference to the segment registers.

This instruction does not cause the entries to be invalidated in other processors.

This is a supervisor-level instruction and optional in the PowerPC architecture.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
OEA                           YES                 YES                 X

tlbie

Translation Lookaside Buffer Invalidate Entry (x'7C00 0264')

tlbie rB

Field:   31     00000    00000    B        306      0
Bits:    0-5    6-10     11-15    16-20    21-30    31

Fields shown as zeros are reserved.

VPS ← rB[4-19]
Identify TLB entries corresponding to VPS
Each such TLB entry ← invalid

EA is the contents of rB. If the translation lookaside buffer (TLB) contains an entry
corresponding to EA, that entry is made invalid (that is, removed from the TLB).

Multiprocessing implementations (for example, the 601 and 604) send a tlbie address-only
broadcast over the address bus to tell other processors to invalidate the same TLB entry in
their TLBs.

The TLB search is done regardless of the settings of MSR[IR] and MSR[DR]. The search
is done based on a portion of the logical page number within a segment, without reference
to the segment registers. All entries matching the search criteria are invalidated.

Block address translation for EA, if any, is ignored. Refer to Section 7.5.3.4,
“Synchronization of Memory Accesses and Referenced and Changed Bit Updates,” and
Section 7.6.3, “Page Table Updates,” for other requirements associated with the use of this
instruction.
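The field selection rB[4-19] used in the operation description follows the architecture's bit numbering, in which bit 0 is the most-significant bit of the 32-bit register. A small Python sketch of that extraction is shown below; the helper names `bits` and `tlbie_vps` are hypothetical.

```python
def bits(word, first, last):
    """Extract bits first..last of a 32-bit word, where bit 0 is the most-significant bit."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)

def tlbie_vps(rb):
    """Return the 16-bit value rB[4-19] used to select TLB entries."""
    return bits(rb, 4, 19)

vps = tlbie_vps(0x0ABCD123)  # bits 0-3 = 0x0, bits 4-19 = 0xABCD, bits 20-31 = 0x123
```

The upper four bits, which select a segment register during normal translation, do not take part in the search, consistent with the statement that the search is done without reference to the segment registers.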

This is a supervisor-level instruction and optional in the PowerPC architecture. 

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
OEA                           YES                 YES                 X

tlbsync

TLB Synchronize (x'7C00 046C')

Field:   31     00000    00000    00000    566      0
Bits:    0-5    6-10     11-15    16-20    21-30    31

Fields shown as zeros are reserved.

If an implementation sends a broadcast for tlbie, then it will also send a broadcast for
tlbsync. Executing a tlbsync instruction ensures that all tlbie instructions previously
executed by the processor executing the tlbsync instruction have completed on all other
processors.

The operation performed by this instruction is treated as a caching-inhibited and guarded
data access with respect to the ordering done by eieio.

NOTE: The 601 expands the use of the sync instruction to cover tlbsync functionality.

Refer to Section 7.5.3.4, “Synchronization of Memory Accesses and Referenced and
Changed Bit Updates,” and Section 7.6.3, “Page Table Updates,” for other requirements
associated with the use of this instruction.

This instruction is supervisor-level and optional in the PowerPC architecture.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
OEA                           YES                 YES                 X

tw

Trap Word (x'7C00 0008')

tw TO,rA,rB

Field:   31     TO      A        B        4        0
Bits:    0-5    6-10    11-15    16-20    21-30    31

Fields shown as zeros are reserved.

a ← EXTS(rA)
b ← EXTS(rB)
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <U b) & TO[3] then TRAP
if (a >U b) & TO[4] then TRAP

The contents of rA are compared arithmetically with the contents of rB for TO[0,1,2]. The
contents of rA are compared logically with the contents of rB for TO[3,4]. If any bit in the
TO field is set and its corresponding condition is met by the result of the comparison, then
the system trap handler is invoked.
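The five trap conditions can be evaluated with a short Python model. This is a sketch only; the function names are illustrative, and TO is passed as a 5-bit integer in which TO[0] is the most-significant bit (value 16) and TO[4] the least-significant bit (value 1).

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def trap_taken(to, ra, rb):
    """Return True if tw TO,rA,rB would invoke the trap handler."""
    a, b = to_signed(ra), to_signed(rb)
    ua, ub = ra & MASK32, rb & MASK32
    conditions = [a < b, a > b, a == b, ua < ub, ua > ub]  # TO[0] .. TO[4]
    return any(cond and (to >> (4 - i)) & 1 for i, cond in enumerate(conditions))

signed_less = trap_taken(0b10000, 0xFFFFFFFF, 0)    # -1 < 0 arithmetically
unsigned_less = trap_taken(0b00010, 0xFFFFFFFF, 0)  # 0xFFFFFFFF is not <U 0
```

The same register contents can therefore trap or not trap depending on whether an arithmetic or a logical comparison bit is selected in TO.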

Other registers altered: 

• None 

Simplified mnemonics:

tweq rA,rB     equivalent to    tw 4,rA,rB
twlge rA,rB    equivalent to    tw 5,rA,rB
trap           equivalent to    tw 31,0,0

twi

Trap Word Immediate (x'0C00 0000')

twi TO,rA,SIMM

Field:   03     TO      A        SIMM
Bits:    0-5    6-10    11-15    16-31

a ← EXTS(rA)
if (a < EXTS(SIMM)) & TO[0] then TRAP
if (a > EXTS(SIMM)) & TO[1] then TRAP
if (a = EXTS(SIMM)) & TO[2] then TRAP
if (a <U EXTS(SIMM)) & TO[3] then TRAP
if (a >U EXTS(SIMM)) & TO[4] then TRAP

The contents of rA are compared arithmetically with the sign-extended value of the SIMM
field for TO[0,1,2]. The contents of rA are compared logically with the sign-extended
value of the SIMM field for TO[3,4]. If any bit in the TO field is set and its corresponding
condition is met by the result of the comparison, then the system trap handler is invoked.

Other registers altered: 

• None 

Simplified mnemonics:

twgti rA,value     equivalent to    twi 8,rA,value
twllei rA,value    equivalent to    twi 6,rA,value
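The TO values used by the simplified mnemonics follow directly from the bit assignments TO[0] through TO[4]. The sketch below decodes a TO value into the conditions it selects; the condition labels in `TO_BITS` are ad hoc names chosen for this example, not architecture-defined identifiers.

```python
# TO[0] .. TO[4]: signed less, signed greater, equal, unsigned less, unsigned greater
TO_BITS = ["lt", "gt", "eq", "ltu", "gtu"]

def to_conditions(to):
    """List the trap conditions selected by a 5-bit TO value (TO[0] is the MSB)."""
    return [name for i, name in enumerate(TO_BITS) if (to >> (4 - i)) & 1]

twgti_conditions = to_conditions(8)   # trap if greater than
twllei_conditions = to_conditions(6)  # trap if logically less than or equal
```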


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

xorx

XOR (x'7C00 0278')

xor     rA,rS,rB    (Rc = 0)
xor.    rA,rS,rB    (Rc = 1)

Field:   31     S       A        B        316      Rc
Bits:    0-5    6-10    11-15    16-20    21-30    31

rA ← (rS) ⊕ (rB)

The contents of rS are XORed with the contents of rB and the result is placed into rA.

Other registers altered: 

• Condition Register (CR0 field): 

Affected: LT, GT, EQ, SO (if Rc = 1) 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  X

xori

XOR Immediate (x'6800 0000')

xori rA,rS,UIMM

Field:   26     S       A        UIMM
Bits:    0-5    6-10    11-15    16-31

rA ← (rS) ⊕ ((16)0 || UIMM)

The contents of rS are XORed with 0x0000 || UIMM and the result is placed into rA.

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

xoris

XOR Immediate Shifted (x'6C00 0000')

xoris rA,rS,UIMM

Field:   27     S       A        UIMM
Bits:    0-5    6-10    11-15    16-31

rA ← (rS) ⊕ (UIMM || (16)0)

The contents of rS are XORed with UIMM || 0x0000 and the result is placed into rA.
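The two immediate forms differ only in where the 16-bit immediate is placed within the 32-bit operand. A minimal Python sketch follows; the function names mirror the mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF

def xori(rs, uimm):
    """rA <- (rS) XOR (0x0000 || UIMM): affects only the low-order halfword."""
    return (rs ^ (uimm & 0xFFFF)) & MASK32

def xoris(rs, uimm):
    """rA <- (rS) XOR (UIMM || 0x0000): affects only the high-order halfword."""
    return (rs ^ ((uimm & 0xFFFF) << 16)) & MASK32

# Applying both forms in sequence XORs a register with a full 32-bit constant.
combined = xori(xoris(0, 0x1234), 0x5678)
```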

Other registers altered: 

• None 


PowerPC Architecture Level    Supervisor Level    PowerPC Optional    Form
UISA                                                                  D

Appendix A. PowerPC Instruction Set Listings

This appendix lists the PowerPC architecture’s instruction set. Instructions are sorted by 
mnemonic, opcode, function, and form. Also included in this appendix is a quick reference 
table that contains general information, such as the architecture level, privilege level, and 
form, and indicates if the instruction is optional. 

Note that split fields, which represent the concatenation of sequences from left to right, are 
shown in lowercase. For more information refer to Chapter 8, “Instruction Set.” 

A.1 Instructions Sorted by Mnemonic 

Table A-1 lists the instructions implemented in the PowerPC architecture in alphabetical
order by mnemonic.

Key: Reserved bits


Table A-1. Complete Instruction List Sorted by Mnemonic 


Name        0-5   6-10      11-15   16-20   21-31
addx        31    D         A       B       OE 266 Rc
addcx       31    D         A       B       OE 10 Rc
addex       31    D         A       B       OE 138 Rc
addi        14    D         A       SIMM
addic       12    D         A       SIMM
addic.      13    D         A       SIMM
addis       15    D         A       SIMM
addmex      31    D         A       00000   OE 234 Rc
addzex      31    D         A       00000   OE 202 Rc
andx        31    S         A       B       28 Rc
andcx       31    S         A       B       60 Rc
andi.       28    S         A       UIMM


andis.      29    S         A       UIMM
bx          18    LI (6-29) AA LK
bcx         16    BO        BI      BD (16-29) AA LK
bcctrx      19    BO        BI      00000   528 LK
bclrx       19    BO        BI      00000   16 LK
cmp         31    crfD 0 L  A       B       0 0
cmpi        11    crfD 0 L  A       SIMM
cmpl        31    crfD 0 L  A       B       32 0
cmpli       10    crfD 0 L  A       UIMM
cntlzwx     31    S         A       00000   26 Rc
crand       19    crbD      crbA    crbB    257 0
crandc      19    crbD      crbA    crbB    129 0
creqv       19    crbD      crbA    crbB    289 0
crnand      19    crbD      crbA    crbB    225 0
crnor       19    crbD      crbA    crbB    33 0
cror        19    crbD      crbA    crbB    449 0
crorc       19    crbD      crbA    crbB    417 0
crxor       19    crbD      crbA    crbB    193 0
dcba¹       31    00000     A       B       758 0
dcbf        31    00000     A       B       86 0
dcbi²       31    00000     A       B       470 0
dcbst       31    00000     A       B       54 0
dcbt        31    00000     A       B       278 0
dcbtst      31    00000     A       B       246 0
dcbz        31    00000     A       B       1014 0
divwx       31    D         A       B       OE 491 Rc
divwux      31    D         A       B       OE 459 Rc
eciwx       31    D         A       B       310 0
ecowx       31    S         A       B       438 0
eieio       31    00000     00000   00000   854 0
eqvx        31    S         A       B       284 Rc
extsbx      31    S         A       00000   954 Rc
extshx      31    S         A       00000   922 Rc


fabsx       63    D         00000   B       264 Rc
faddx       63    D         A       B       00000 21 Rc
faddsx      59    D         A       B       00000 21 Rc
fcmpo       63    crfD 00   A       B       32 0
fcmpu       63    crfD 00   A       B       0 0
fctiwx      63    D         00000   B       14 Rc
fctiwzx     63    D         00000   B       15 Rc
fdivx       63    D         A       B       00000 18 Rc
fdivsx      59    D         A       B       00000 18 Rc
fmaddx      63    D         A       B       C 29 Rc
fmaddsx     59    D         A       B       C 29 Rc
fmrx        63    D         00000   B       72 Rc
fmsubx      63    D         A       B       C 28 Rc
fmsubsx     59    D         A       B       C 28 Rc
fmulx       63    D         A       00000   C 25 Rc
fmulsx      59    D         A       00000   C 25 Rc
fnabsx      63    D         00000   B       136 Rc
fnegx       63    D         00000   B       40 Rc
fnmaddx     63    D         A       B       C 31 Rc
fnmaddsx    59    D         A       B       C 31 Rc
fnmsubx     63    D         A       B       C 30 Rc
fnmsubsx    59    D         A       B       C 30 Rc
fresx¹      59    D         00000   B       00000 24 Rc
frspx       63    D         00000   B       12 Rc
frsqrtex¹   63    D         00000   B       00000 26 Rc
fselx¹      63    D         A       B       C 23 Rc
fsqrtx¹     63    D         00000   B       00000 22 Rc
fsqrtsx¹    59    D         00000   B       00000 22 Rc
fsubx       63    D         A       B       00000 20 Rc
fsubsx      59    D         A       B       00000 20 Rc
icbi        31    00000     A       B       982 0
isync       19    00000     00000   00000   150 0
lbz         34    D         A       d


lbzu        35    D         A       d
lbzux       31    D         A       B       119 0
lbzx        31    D         A       B       87 0
lfd         50    D         A       d
lfdu        51    D         A       d
lfdux       31    D         A       B       631 0
lfdx        31    D         A       B       599 0
lfs         48    D         A       d
lfsu        49    D         A       d
lfsux       31    D         A       B       567 0
lfsx        31    D         A       B       535 0
lha         42    D         A       d
lhau        43    D         A       d
lhaux       31    D         A       B       375 0
lhax        31    D         A       B       343 0
lhbrx       31    D         A       B       790 0
lhz         40    D         A       d
lhzu        41    D         A       d
lhzux       31    D         A       B       311 0
lhzx        31    D         A       B       279 0
lmw³        46    D         A       d
lswi³       31    D         A       NB      597 0
lswx³       31    D         A       B       533 0
lwarx       31    D         A       B       20 0
lwbrx       31    D         A       B       534 0
lwz         32    D         A       d
lwzu        33    D         A       d
lwzux       31    D         A       B       55 0
lwzx        31    D         A       B       23 0
mcrf        19    crfD 00   crfS 00 00000   0 0
mcrfs       63    crfD 00   crfS 00 00000   64 0
mcrxr       31    crfD 00   00000   00000   512 0
mfcr        31    D         00000   00000   19 0


mffsx       63    D         00000   00000   583 Rc
mfmsr²      31    D         00000   00000   83 0
mfspr⁴      31    D         spr (11-20)     339 0
mfsr²       31    D         0 SR    00000   595 0
mfsrin²     31    D         00000   B       659 0
mftb        31    D         tbr (11-20)     371 0
mtcrf       31    S         0 CRM 0 (11-20) 144 0
mtfsb0x     63    crbD      00000   00000   70 Rc
mtfsb1x     63    crbD      00000   00000   38 Rc
mtfsfx      63    0 FM 0 (6-15)     B       711 Rc
mtfsfix     63    crfD 00   00000   IMM 0   134 Rc
mtmsr²      31    S         00000   00000   146 0
mtspr⁴      31    S         spr (11-20)     467 0
mtsr²       31    S         0 SR    00000   210 0
mtsrin²     31    S         00000   B       242 0
mulhwx      31    D         A       B       0 75 Rc
mulhwux     31    D         A       B       0 11 Rc
mulli       07    D         A       SIMM
mullwx      31    D         A       B       OE 235 Rc
nandx       31    S         A       B       476 Rc
negx        31    D         A       00000   OE 104 Rc
norx        31    S         A       B       124 Rc
orx         31    S         A       B       444 Rc
orcx        31    S         A       B       412 Rc
ori         24    S         A       UIMM
oris        25    S         A       UIMM
rfi²        19    00000     00000   00000   50 0
rlwimix     20    S         A       SH      MB ME Rc
rlwinmx     21    S         A       SH      MB ME Rc
rlwnmx      23    S         A       B       MB ME Rc
sc          17    00000     00000   00000000000000 1 0
slwx        31    S         A       B       24 Rc
srawx       31    S         A       B       792 Rc


srawix      31    S         A       SH      824 Rc
srwx        31    S         A       B       536 Rc
stb         38    S         A       d
stbu        39    S         A       d
stbux       31    S         A       B       247 0
stbx        31    S         A       B       215 0
stfd        54    S         A       d
stfdu       55    S         A       d
stfdux      31    S         A       B       759 0
stfdx       31    S         A       B       727 0
stfiwx¹     31    S         A       B       983 0
stfs        52    S         A       d
stfsu       53    S         A       d
stfsux      31    S         A       B       695 0
stfsx       31    S         A       B       663 0
sth         44    S         A       d
sthbrx      31    S         A       B       918 0
sthu        45    S         A       d
sthux       31    S         A       B       439 0
sthx        31    S         A       B       407 0
stmw³       47    S         A       d
stswi³      31    S         A       NB      725 0
stswx³      31    S         A       B       661 0
stw         36    S         A       d
stwbrx      31    S         A       B       662 0
stwcx.      31    S         A       B       150 1
stwu        37    S         A       d
stwux       31    S         A       B       183 0
stwx        31    S         A       B       151 0
subfx       31    D         A       B       OE 40 Rc
subfcx      31    D         A       B       OE 8 Rc
subfex      31    D         A       B       OE 136 Rc
subfic      08    D         A       SIMM


subfmex     31    D         A       00000   OE 232 Rc
subfzex     31    D         A       00000   OE 200 Rc
sync        31    00000     00000   00000   598 0
tlbia¹,²    31    00000     00000   00000   370 0
tlbie¹,²    31    00000     00000   B       306 0
tlbsync¹,²  31    00000     00000   00000   566 0
tw          31    TO        A       B       4 0
twi         03    TO        A       SIMM
xorx        31    S         A       B       316 Rc
xori        26    S         A       UIMM
xoris       27    S         A       UIMM

Notes:

¹ Optional instruction
² Supervisor-level instruction
³ Load/store string/multiple instruction
⁴ Supervisor- and user-level instruction















A.2 Instructions Sorted by Opcode

Table A-2 lists the instructions defined in the PowerPC architecture in numeric order by 
opcode. 
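The ordering used in the table corresponds to the primary opcode in bits 0-5 and, for instructions that share a primary opcode, the extended opcode in the bits that follow. The Python sketch below extracts both from a 32-bit instruction word. The function names are illustrative, and the extended-opcode helper assumes the 10-bit X-form field in bits 21-30; other forms (for example, XO-form with OE in bit 21, or A-form with a 5-bit field) would need different masks.

```python
def primary_opcode(word):
    """Bits 0-5 of the instruction word (bit 0 is the most-significant bit)."""
    return (word >> 26) & 0x3F

def extended_opcode_x(word):
    """Bits 21-30 of the instruction word, as used by X-form instructions."""
    return (word >> 1) & 0x3FF

sync_word = 0x7C0004AC   # encoding listed for sync
stwux_word = 0x7C00016E  # encoding listed for stwux
twi_word = 0x0C000000    # encoding listed for twi with all operand fields zero
```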

Key: Reserved bits 


Table A-2. Complete Instruction List Sorted by Opcode 


Name        0-5      6-10      11-15   16-20   21-30        31
twi         000011   TO        A       SIMM
mulli       000111   D         A       SIMM
subfic      001000   D         A       SIMM
cmpli       001010   crfD 0 L  A       UIMM
cmpi        001011   crfD 0 L  A       SIMM
addic       001100   D         A       SIMM
addic.      001101   D         A       SIMM
addi        001110   D         A       SIMM
addis       001111   D         A       SIMM
bcx         010000   BO        BI      BD (16-29) AA LK
sc          010001   00000     00000   00000000000000 1 0
bx          010010   LI (6-29) AA LK
mcrf        010011   crfD 00   crfS 00 00000   0000000000   0
bclrx       010011   BO        BI      00000   0000010000   LK
crnor       010011   crbD      crbA    crbB    0000100001   0
rfi²        010011   00000     00000   00000   0000110010   0
crandc      010011   crbD      crbA    crbB    0010000001   0
isync       010011   00000     00000   00000   0010010110   0
crxor       010011   crbD      crbA    crbB    0011000001   0
crnand      010011   crbD      crbA    crbB    0011100001   0
crand       010011   crbD      crbA    crbB    0100000001   0
creqv       010011   crbD      crbA    crbB    0100100001   0
crorc       010011   crbD      crbA    crbB    0110100001   0
cror        010011   crbD      crbA    crbB    0111000001   0
bcctrx      010011   BO        BI      00000   1000010000   LK
rlwimix     010100   S         A       SH      MB ME        Rc
rlwinmx     010101   S         A       SH      MB ME        Rc
rlwnmx      010111   S         A       B       MB ME        Rc


0 11000 
0 11001 
0 110 10 


0 110 11 


0 11100 



mulhwux 
mfcr 
Iwarx 
Iwzx 
slwx 
cntlzwx I 


a 

c 

si 



a 

mulhwx 

mfmsr 2 

dcbf 

Ibzx 

neax 


subfex 
addex 
mtcrf 
mtmsr 2 


0 11111 


011111 

011111 

011111 

011111 

011111 


011111 


0 11111 


0 11111 
0 11111 
0 11111 
0 11111 


0 11111 


011111 

011111 

011111 

011111 

011111 


0 11111 


0 11111 


011111 

011111 

011111 

011111 


crfD 0 L 
D 

0 0 0 0 0 
D 


S 


D 


D 

0 0 0 0 0 
D 


A 


A 


0 0 0 0 0 
A 
A 
A 


A 


A 


A 

A 

A 

A 


A 


A 


0 0 0 0 0 
A 
A 
A 


A 


A 


B 


B 


0 0 0 0 0 
B 
B 
B 


0 0 0 0 0 


B 


B 

B 

B 

B 


B 


B 


0 0 0 0 0 
B 
B 

0 0 0 0 0 


B 


B 




a 


0 0 0 0 0 0 1 0 1 0 


0 0 0 0 0 0 1 0 1 1 


0 0 0 0 0 1 0 0 1 1 
0 0 0 0 0 1 0 1 0 0 
0 0 0 0 0 1 0 1 1 1 
0 0 0 0 0 1 1 0 0 0 


0 0 0 0 0 1 1 0 1 0 


0 0 0 0 0 1 1 1 0 0 


0000100000 
0 0 0 0 1 0 1 0 0 0 
0000110110 
0000110111 


0000111100 


0001 001011 


0001010011 
0001010110 
0001010111 
0001 1 01 000 


0001110111 


0001111100 




I 




D 

A 

B 

OE 

0010001000 

Rc 

D 

A 

B 

OE 

0010001010 

Rc 

S 

0 CRM 0 


0010010000 

0 

S 

0 0 0 0 0 

0 0 0 0 0 


0010010010 

0 
























































Name        0-5      6-10      11-15   16-20   21-30            31
stwcx.      011111   S         A       B       0010010110       1
stwx        011111   S         A       B       0010010111       0
stwux       011111   S         A       B       0010110111       0
subfzex     011111   D         A       00000   OE 0011001000    Rc
addzex      011111   D         A       00000   OE 0011001010    Rc
mtsr²       011111   S         0 SR    00000   0011010010       0
stbx        011111   S         A       B       0011010111       0
subfmex     011111   D         A       00000   OE 0011101000    Rc
addmex      011111   D         A       00000   OE 0011101010    Rc
mullwx      011111   D         A       B       OE 0011101011    Rc
mtsrin²     011111   S         00000   B       0011110010       0
dcbtst      011111   00000     A       B       0011110110       0
stbux       011111   S         A       B       0011110111       0
addx        011111   D         A       B       OE 0100001010    Rc
dcbt        011111   00000     A       B       0100010110       0
lhzx        011111   D         A       B       0100010111       0
eqvx        011111   S         A       B       0100011100       Rc
tlbie¹,²    011111   00000     00000   B       0100110010       0
eciwx       011111   D         A       B       0100110110       0
lhzux       011111   D         A       B       0100110111       0
xorx        011111   S         A       B       0100111100       Rc
mfspr²,⁴    011111   D         spr (11-20)     0101010011       0
lhax        011111   D         A       B       0101010111       0
tlbia¹,²    011111   00000     00000   00000   0101110010       0
mftb        011111   D         tbr (11-20)     0101110011       0
lhaux       011111   D         A       B       0101110111       0
sthx        011111   S         A       B       0110010111       0
orcx        011111   S         A       B       0110011100       Rc
ecowx       011111   S         A       B       0110110110       0
sthux       011111   S         A       B       0110110111       0
orx         011111   S         A       B       0110111100       Rc
divwux      011111   D         A       B       OE 0111001011    Rc
mtspr²,⁴    011111   S         spr (11-20)     0111010011       0

Name        0-5      6-10      11-15   16-20   21-30            31
dcbi²       011111   00000     A       B       0111010110       0
nandx       011111   S         A       B       0111011100       Rc
divwx       011111   D         A       B       OE 0111101011    Rc
mcrxr       011111   crfD 00   00000   00000   1000000000       0
lswx³       011111   D         A       B       1000010101       0
lwbrx       011111   D         A       B       1000010110       0
lfsx        011111   D         A       B       1000010111       0
srwx        011111   S         A       B       1000011000       Rc
tlbsync¹,²  011111   00000     00000   00000   1000110110       0
lfsux       011111   D         A       B       1000110111       0
mfsr²       011111   D         0 SR    00000   1001010011       0
lswi³       011111   D         A       NB      1001010101       0
sync        011111   00000     00000   00000   1001010110       0
lfdx        011111   D         A       B       1001010111       0
lfdux       011111   D         A       B       1001110111       0
mfsrin²     011111   D         00000   B       1010010011       0
stswx³      011111   S         A       B       1010010101       0
stwbrx      011111   S         A       B       1010010110       0
stfsx       011111   S         A       B       1010010111       0
stfsux      011111   S         A       B       1010110111       0
stswi³      011111   S         A       NB      1011010101       0
stfdx       011111   S         A       B       1011010111       0
dcba¹       011111   00000     A       B       1011110110       0
stfdux      011111   S         A       B       1011110111       0
lhbrx       011111   D         A       B       1100010110       0
srawx       011111   S         A       B       1100011000       Rc
srawix      011111   S         A       SH      1100111000       Rc
eieio       011111   00000     00000   00000   1101010110       0
sthbrx      011111   S         A       B       1110010110       0
extshx      011111   S         A       00000   1110011010       Rc
extsbx      011111   S         A       00000   1110111010       Rc
icbi        011111   00000     A       B       1111010110       0
stfiwx¹     011111   S         A       B       1111010111       0

Name        0-5      6-10      11-15   16-20   21-30            31
dcbz        011111   00000     A       B       1111110110       0
lwz         100000   D         A       d
lwzu        100001   D         A       d
lbz         100010   D         A       d
lbzu        100011   D         A       d
stw         100100   S         A       d
stwu        100101   S         A       d
stb         100110   S         A       d
stbu        100111   S         A       d
lhz         101000   D         A       d
lhzu        101001   D         A       d
lha         101010   D         A       d
lhau        101011   D         A       d
sth         101100   S         A       d
sthu        101101   S         A       d
lmw³        101110   D         A       d
stmw³       101111   S         A       d
lfs         110000   D         A       d
lfsu        110001   D         A       d
lfd         110010   D         A       d
lfdu        110011   D         A       d
stfs        110100   S         A       d
stfsu       110101   S         A       d
stfd        110110   S         A       d
stfdu       110111   S         A       d
fdivsx      111011   D         A       B       00000 10010      Rc
fsubsx      111011   D         A       B       00000 10100      Rc
faddsx      111011   D         A       B       00000 10101      Rc
fsqrtsx¹    111011   D         00000   B       00000 10110      Rc
fresx¹      111011   D         00000   B       00000 11000      Rc
fmulsx      111011   D         A       00000   C 11001          Rc
fmsubsx     111011   D         A       B       C 11100          Rc
fmaddsx     111011   D         A       B       C 11101          Rc
fnmsubsx    111011  D        A        B      C      11110  Rc
fnmaddsx    111011  D        A        B      C      11111  Rc
fcmpu       111111  crfD 00  A        B      0000000000    0
frspx       111111  D        00000    B      0000001100    Rc
fctiwx      111111  D        00000    B      0000001110    Rc
fctiwzx     111111  D        00000    B      0000001111    Rc
fdivx       111111  D        A        B      00000  10010  Rc
fsubx       111111  D        A        B      00000  10100  Rc
faddx       111111  D        A        B      00000  10101  Rc
fsqrtx 1    111111  D        00000    B      00000  10110  Rc
fselx 1     111111  D        A        B      C      10111  Rc
fmulx       111111  D        A        00000  C      11001  Rc
frsqrtex 1  111111  D        00000    B      00000  11010  Rc
fmsubx      111111  D        A        B      C      11100  Rc
fmaddx      111111  D        A        B      C      11101  Rc
fnmsubx     111111  D        A        B      C      11110  Rc
fnmaddx     111111  D        A        B      C      11111  Rc
fcmpo       111111  crfD 00  A        B      0000100000    0
mtfsb1x     111111  crbD     00000    00000  0000100110    Rc
fnegx       111111  D        00000    B      0000101000    Rc
mcrfs       111111  crfD 00  crfS 00  00000  0001000000    0
mtfsb0x     111111  crbD     00000    00000  0001000110    Rc
fmrx        111111  D        00000    B      0001001000    Rc
mtfsfix     111111  crfD 00  00000    IMM 0  0010000110    Rc
fnabsx      111111  D        00000    B      0010001000    Rc
fabsx       111111  D        00000    B      0100001000    Rc
mffsx       111111  D        00000    00000  1001000111    Rc
mtfsfx      111111  0 FM     FM 0     B      1011000111    Rc






Notes:

1 Optional instruction

2 Supervisor-level instruction

3 Load/store string/multiple instruction

4 Supervisor-level and user-level instruction
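
The binary opcode fields in the listing above are the same values that the tables in the following section give in decimal. The following is a minimal Python sketch of extracting a field from a 32-bit instruction word, assuming the bit numbering used in these tables (bit 0 is the most-significant bit). The helper name field is hypothetical and the sample word is assembled by hand from the faddsx row.

```python
def field(word, first, last):
    """Return bits first..last (inclusive) of a 32-bit word, bit 0 = MSB."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)

# fadds f1,f2,f3 built by hand: opcode 59, D=1, A=2, B=3, XO=21, Rc=0
word = (59 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (21 << 1)

primary = field(word, 0, 5)     # 0b111011, listed as 59 in decimal
extended = field(word, 26, 30)  # 0b10101, listed as 21 in decimal
```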
A.3 Instructions Grouped by Functional Categories

Table A-3 through Table A-30 list the PowerPC instructions grouped by function.


Key: 


Reserved bits 


Table A-3. Integer Arithmetic Instructions

Name      0-5  6-10  11-15  16-20  21  22-30  31
addx      31   D     A      B      OE  266    Rc
addcx     31   D     A      B      OE  10     Rc
addex     31   D     A      B      OE  138    Rc
addi      14   D     A      SIMM
addic     12   D     A      SIMM
addic.    13   D     A      SIMM
addis     15   D     A      SIMM
addmex    31   D     A      00000  OE  234    Rc
addzex    31   D     A      00000  OE  202    Rc
divwx     31   D     A      B      OE  491    Rc
divwux    31   D     A      B      OE  459    Rc
mulhwx    31   D     A      B      0   75     Rc
mulhwux   31   D     A      B      0   11     Rc
mulli     07   D     A      SIMM
mullwx    31   D     A      B      OE  235    Rc
negx      31   D     A      00000  OE  104    Rc
subfx     31   D     A      B      OE  40     Rc
subfcx    31   D     A      B      OE  8      Rc
subfic    08   D     A      SIMM
subfex    31   D     A      B      OE  136    Rc
subfmex   31   D     A      00000  OE  232    Rc
subfzex   31   D     A      00000  OE  200    Rc
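
As an illustration of how a row of Table A-3 maps onto a 32-bit instruction word, the following Python sketch packs the XO-form fields (bits 0-5 opcode, 6-10 D, 11-15 A, 16-20 B, 21 OE, 22-30 extended opcode, 31 Rc). The function name encode_xo_form is hypothetical; the field positions are those of the table header.

```python
def encode_xo_form(opcd, rd, ra, rb, oe, xo, rc):
    """Pack an XO-form instruction; bit 0 is the most-significant bit."""
    return ((opcd << 26) | (rd << 21) | (ra << 16) | (rb << 11)
            | (oe << 10) | (xo << 1) | rc)

# addx row: primary opcode 31, extended opcode 266.
# add r3,r4,r5 with OE = 0 and Rc = 0:
word = encode_xo_form(31, 3, 4, 5, 0, 266, 0)
```

Setting the last argument to 1 gives the record form (add.), and setting OE to 1 gives the overflow-enabled form (addo).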


Table A-4. Integer Compare Instructions

Name   0-5  6-8   9  10  11-15  16-20  21-30       31
cmp    31   crfD  0  L   A      B      0000000000  0
cmpi   11   crfD  0  L   A      SIMM
cmpl   31   crfD  0  L   A      B      32          0
cmpli  10   crfD  0  L   A      UIMM


Table A-5. Integer Logical Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
andx      31   S     A      B      28     Rc
andcx     31   S     A      B      60     Rc
andi.     28   S     A      UIMM
andis.    29   S     A      UIMM
cntlzwx   31   S     A      00000  26     Rc
eqvx      31   S     A      B      284    Rc
extsbx    31   S     A      00000  954    Rc
extshx    31   S     A      00000  922    Rc
nandx     31   S     A      B      476    Rc
norx      31   S     A      B      124    Rc
orx       31   S     A      B      444    Rc
orcx      31   S     A      B      412    Rc
ori       24   S     A      UIMM
oris      25   S     A      UIMM
xorx      31   S     A      B      316    Rc
xori      26   S     A      UIMM
xoris     27   S     A      UIMM


Table A-6. Integer Rotate Instructions

Name      0-5  6-10  11-15  16-20  21-25  26-30  31
rlwimix   20   S     A      SH     MB     ME     Rc
rlwinmx   21   S     A      SH     MB     ME     Rc
rlwnmx    23   S     A      B      MB     ME     Rc
Table A-7. Integer Shift Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
slwx      31   S     A      B      24     Rc
srawx     31   S     A      B      792    Rc
srawix    31   S     A      SH     824    Rc
srwx      31   S     A      B      536    Rc


Table A-8. Floating-Point Arithmetic Instructions

Name        0-5  6-10  11-15  16-20  21-25  26-30  31
faddx       63   D     A      B      00000  21     Rc
faddsx      59   D     A      B      00000  21     Rc
fdivx       63   D     A      B      00000  18     Rc
fdivsx      59   D     A      B      00000  18     Rc
fmulx       63   D     A      00000  C      25     Rc
fmulsx      59   D     A      00000  C      25     Rc
fresx 1     59   D     00000  B      00000  24     Rc
frsqrtex 1  63   D     00000  B      00000  26     Rc
fsubx       63   D     A      B      00000  20     Rc
fsubsx      59   D     A      B      00000  20     Rc
fselx 1     63   D     A      B      C      23     Rc
fsqrtx 1    63   D     00000  B      00000  22     Rc
fsqrtsx 1   59   D     00000  B      00000  22     Rc

Note:

1 Optional instruction

Table A-9. Floating-Point Multiply-Add Instructions

Name      0-5  6-10  11-15  16-20  21-25  26-30  31
fmaddx    63   D     A      B      C      29     Rc
fmaddsx   59   D     A      B      C      29     Rc
fmsubx    63   D     A      B      C      28     Rc
fmsubsx   59   D     A      B      C      28     Rc
fnmaddx   63   D     A      B      C      31     Rc
fnmaddsx  59   D     A      B      C      31     Rc
fnmsubx   63   D     A      B      C      30     Rc
fnmsubsx  59   D     A      B      C      30     Rc


Table A-10. Floating-Point Rounding and Conversion Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
fctiwx    63   D     00000  B      14     Rc
fctiwzx   63   D     00000  B      15     Rc
frspx     63   D     00000  B      12     Rc


Table A-11. Floating-Point Compare Instructions

Name   0-5  6-8   9-10  11-15  16-20  21-30  31
fcmpo  63   crfD  00    A      B      32     0
fcmpu  63   crfD  00    A      B      0      0


Table A-12. Floating-Point Status and Control Register Instructions

Name      0-5  6-10     11-15    16-20  21-30  31
mcrfs     63   crfD 00  crfS 00  00000  64     0
mffsx     63   D        00000    00000  583    Rc
mtfsb0x   63   crbD     00000    00000  70     Rc
mtfsb1x   63   crbD     00000    00000  38     Rc
mtfsfx    63   0 FM     FM 0     B      711    Rc
mtfsfix   63   crfD 00  00000    IMM 0  134    Rc


Table A-13. Integer Load Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
lbz       34   D     A      d
lbzu      35   D     A      d
lbzux     31   D     A      B      119    0
lbzx      31   D     A      B      87     0
lha       42   D     A      d
lhau      43   D     A      d
lhaux     31   D     A      B      375    0
lhax      31   D     A      B      343    0
lhz       40   D     A      d
lhzu      41   D     A      d
lhzux     31   D     A      B      311    0
lhzx      31   D     A      B      279    0
lwz       32   D     A      d
lwzu      33   D     A      d
lwzux     31   D     A      B      55     0
lwzx      31   D     A      B      23     0


Table A-14. Integer Store Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
stb       38   S     A      d
stbu      39   S     A      d
stbux     31   S     A      B      247    0
stbx      31   S     A      B      215    0
sth       44   S     A      d
sthu      45   S     A      d
sthux     31   S     A      B      439    0
sthx      31   S     A      B      407    0
stw       36   S     A      d
stwu      37   S     A      d
stwux     31   S     A      B      183    0
stwx      31   S     A      B      151    0


Table A-15. Integer Load and Store with Byte Reverse Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
lhbrx     31   D     A      B      790    0
lwbrx     31   D     A      B      534    0
sthbrx    31   S     A      B      918    0
stwbrx    31   S     A      B      662    0
Table A-16. Integer Load and Store Multiple Instructions

Name      0-5  6-10  11-15  16-31
lmw 1     46   D     A      d
stmw 1    47   S     A      d

Note:

1 Load/store string/multiple instruction

Table A-17. Integer Load and Store String Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
lswi 1    31   D     A      NB     597    0
lswx 1    31   D     A      B      533    0
stswi 1   31   S     A      NB     725    0
stswx 1   31   S     A      B      661    0

Note:

1 Load/store string/multiple instruction


Table A-18. Memory Synchronization Instructions

Name      0-5  6-10   11-15  16-20  21-30  31
eieio     31   00000  00000  00000  854    0
isync     19   00000  00000  00000  150    0
lwarx     31   D      A      B      20     0
stwcx.    31   S      A      B      150    1
sync      31   00000  00000  00000  598    0


Table A-19. Floating-Point Load Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
lfd       50   D     A      d
lfdu      51   D     A      d
lfdux     31   D     A      B      631    0
lfdx      31   D     A      B      599    0
lfs       48   D     A      d
lfsu      49   D     A      d
lfsux     31   D     A      B      567    0
lfsx      31   D     A      B      535    0
Table A-20. Floating-Point Store Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
stfd      54   S     A      d
stfdu     55   S     A      d
stfdux    31   S     A      B      759    0
stfdx     31   S     A      B      727    0
stfiwx 1  31   S     A      B      983    0
stfs      52   S     A      d
stfsu     53   S     A      d
stfsux    31   S     A      B      695    0
stfsx     31   S     A      B      663    0

Note:

1 Optional instruction

Table A-21. Floating-Point Move Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
fabsx     63   D     00000  B      264    Rc
fmrx      63   D     00000  B      72     Rc
fnabsx    63   D     00000  B      136    Rc
fnegx     63   D     00000  B      40     Rc

Table A-22. Branch Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
bx        18   LI (bits 6-29), AA (bit 30), LK (bit 31)
bcx       16   BO    BI     BD (bits 16-29), AA (bit 30), LK (bit 31)
bcctrx    19   BO    BI     00000  528    LK
bclrx     19   BO    BI     00000  16     LK

Table A-23. Condition Register Logical Instructions

Name      0-5  6-10     11-15    16-20  21-30       31
crand     19   crbD     crbA     crbB   257         0
crandc    19   crbD     crbA     crbB   129         0
creqv     19   crbD     crbA     crbB   289         0
crnand    19   crbD     crbA     crbB   225         0
crnor     19   crbD     crbA     crbB   33          0
cror      19   crbD     crbA     crbB   449         0
crorc     19   crbD     crbA     crbB   417         0
crxor     19   crbD     crbA     crbB   193         0
mcrf      19   crfD 00  crfS 00  00000  0000000000  0


Table A-24. System Linkage Instructions

Name      0-5  6-10   11-15  16-20  21-30  31
rfi 1     19   00000  00000  00000  50     0

Name      0-5  6-10   11-15  16-29           30  31
sc        17   00000  00000  00000000000000  1   0

Notes:

1 Supervisor-level instruction

Table A-25. Trap Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
tw        31   TO    A      B      4      0
twi       03   TO    A      SIMM
Table A-26. Processor Control Instructions

Name      0-5  6-10     11-15  16-20  21-30  31
mcrxr     31   crfD 00  00000  00000  512    0
mfcr      31   D        00000  00000  19     0
mfmsr 1   31   D        00000  00000  83     0
mfspr 2   31   D        spr (bits 11-20)  339  0
mftb      31   D        tbr (bits 11-20)  371  0
mtcrf     31   S        0 CRM  CRM 0  144    0
mtmsr 1   31   S        00000  00000  146    0
mtspr 2   31   S        spr (bits 11-20)  467  0

Notes:

1 Supervisor-level instruction

2 Supervisor- and user-level instruction
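
The spr field in the mfspr and mtspr rows occupies bits 11-20. The Python sketch below recovers the SPR number from an instruction word. It assumes the split-field layout given in the mfspr and mtspr instruction descriptions (not shown in this table): the low-order five bits of the SPR number are in bits 11-15 and the high-order five bits are in bits 16-20. The function name decode_spr_number is hypothetical.

```python
def decode_spr_number(word):
    """Recover the SPR number from an mfspr/mtspr instruction word."""
    spr_field = (word >> 11) & 0x3FF   # bits 11-20 as one 10-bit value
    low5 = spr_field >> 5              # bits 11-15: low-order half of the SPR number
    high5 = spr_field & 0x1F           # bits 16-20: high-order half of the SPR number
    return (high5 << 5) | low5

lr_number = decode_spr_number(0x7C6802A6)   # mfspr r3,8 (mflr r3)
pvr_number = decode_spr_number(0x7C7F42A6)  # mfspr r3,287 (mfpvr r3)
```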


Table A-27. Cache Management Instructions

Name      0-5  6-10   11-15  16-20  21-30  31
dcba 1    31   00000  A      B      758    0
dcbf      31   00000  A      B      86     0
dcbi 2    31   00000  A      B      470    0
dcbst     31   00000  A      B      54     0
dcbt      31   00000  A      B      278    0
dcbtst    31   00000  A      B      246    0
dcbz      31   00000  A      B      1014   0
icbi      31   00000  A      B      982    0

Notes:

1 Optional instruction

2 Supervisor-level instruction
Table A-28. Segment Register Manipulation Instructions

Name      0-5  6-10  11  12-15  16-20  21-30  31
mfsr 1    31   D     0   SR     00000  595    0
mfsrin 1  31   D     00000      B      659    0
mtsr 1    31   S     0   SR     00000  210    0
mtsrin 1  31   S     00000      B      242    0

Notes:

1 Supervisor-level instruction

Table A-29. Lookaside Buffer Management Instructions

Name         0-5  6-10   11-15  16-20  21-30  31
tlbia 1,2    31   00000  00000  00000  370    0
tlbie 1,2    31   00000  00000  B      306    0
tlbsync 1,2  31   00000  00000  00000  566    0

Notes:

1 Supervisor-level instruction

2 Optional instruction


Table A-30. External Control Instructions

Name      0-5  6-10  11-15  16-20  21-30  31
eciwx     31   D     A      B      310    0
ecowx     31   S     A      B      438    0
A.4 Instructions Sorted by Form

Table A-31 through Table A-41 list the PowerPC instructions grouped by form. 

Key: 

Reserved bits 


Table A-31. I-Form

          0-5   6-29  30  31
          OPCD  LI    AA  LK

Specific Instruction

Name      0-5   6-29  30  31
bx        18    LI    AA  LK
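
To make the I-form layout concrete, the following Python sketch splits a branch instruction word into its fields. It assumes the usual interpretation of LI from the bx instruction description (not restated in this table): the 24-bit LI field is concatenated with 0b00 and sign-extended to give a byte displacement. The function name decode_i_form is hypothetical.

```python
def decode_i_form(word):
    """Split an I-form word into (opcode, byte displacement, AA, LK)."""
    opcd = (word >> 26) & 0x3F
    li = (word >> 2) & 0xFFFFFF        # bits 6-29
    aa = (word >> 1) & 1               # bit 30
    lk = word & 1                      # bit 31
    disp = li << 2                     # LI || 0b00 gives a 26-bit value
    if disp & 0x2000000:               # sign bit of the 26-bit value
        disp -= 0x4000000
    return opcd, disp, aa, lk

forward_call = decode_i_form(0x48000101)   # bl to an address 0x100 bytes ahead
backward = decode_i_form(0x4BFFFFFC)       # b to the previous instruction
```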


Table A-32. B-Form

          0-5   6-10  11-15  16-29  30  31
          OPCD  BO    BI     BD     AA  LK

Specific Instruction

Name      0-5   6-10  11-15  16-29  30  31
bcx       16    BO    BI     BD     AA  LK


Table A-33. SC-Form

          0-5   6-10   11-15  16-29           30  31
          OPCD  00000  00000  00000000000000  1   0

Specific Instruction

Name      0-5   6-10   11-15  16-29           30  31
sc        17    00000  00000  00000000000000  1   0


Table A-34. D-Form

          0-5   6-10      11-15  16-31
          OPCD  D         A      d
          OPCD  D         A      SIMM
          OPCD  S         A      d
          OPCD  S         A      UIMM
          OPCD  crfD 0 L  A      SIMM
          OPCD  crfD 0 L  A      UIMM
          OPCD  TO        A      SIMM

Specific Instructions

Name      0-5   6-10      11-15  16-31
addi      14    D         A      SIMM
addic     12    D         A      SIMM
addic.    13    D         A      SIMM
addis     15    D         A      SIMM
andi.     28    S         A      UIMM
andis.    29    S         A      UIMM
cmpi      11    crfD 0 L  A      SIMM
cmpli     10    crfD 0 L  A      UIMM
lbz       34    D         A      d
lbzu      35    D         A      d
lfd       50    D         A      d
lfdu      51    D         A      d
lfs       48    D         A      d
lfsu      49    D         A      d
lha       42    D         A      d
lhau      43    D         A      d
lhz       40    D         A      d
lhzu      41    D         A      d
lmw 1     46    D         A      d
lwz       32    D         A      d
lwzu      33    D         A      d
mulli     07    D         A      SIMM
ori       24    S         A      UIMM
oris      25    S         A      UIMM
stb       38    S         A      d
stbu      39    S         A      d
stfd      54    S         A      d
stfdu     55    S         A      d
stfs      52    S         A      d
stfsu     53    S         A      d
sth       44    S         A      d
sthu      45    S         A      d
stmw 1    47    S         A      d
stw       36    S         A      d
stwu      37    S         A      d
subfic    08    D         A      SIMM
twi       03    TO        A      SIMM
xori      26    S         A      UIMM
xoris     27    S         A      UIMM

Note:

1 Load/store string/multiple instruction
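
The D-form layout can be illustrated with a short Python sketch that decodes the four fields of a load or store with displacement. It assumes that the 16-bit d (or SIMM) field in bits 16-31 is treated as a signed two's-complement value, as in the individual instruction descriptions. The function name decode_d_form is hypothetical.

```python
def decode_d_form(word):
    """Split a D-form word into (opcode, D or S, A, signed 16-bit immediate)."""
    opcd = (word >> 26) & 0x3F          # bits 0-5
    rd = (word >> 21) & 0x1F            # bits 6-10
    ra = (word >> 16) & 0x1F            # bits 11-15
    imm = word & 0xFFFF                 # bits 16-31
    simm = imm - 0x10000 if imm & 0x8000 else imm
    return opcd, rd, ra, simm

fields = decode_d_form(0x8061FFF8)      # lwz r3,-8(r1): opcode 32 in the table
```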


Table A-35. X-Form

          0-5   6-10      11-15    16-20  21-30  31
          OPCD  D         A        B      XO     0
          OPCD  D         A        NB     XO     0
          OPCD  D         00000    B      XO     0
          OPCD  D         00000    00000  XO     0
          OPCD  D         0 SR     00000  XO     0
          OPCD  S         A        B      XO     Rc
          OPCD  S         A        B      XO     1
          OPCD  S         A        B      XO     0
          OPCD  S         A        NB     XO     0
          OPCD  S         A        00000  XO     Rc
          OPCD  S         00000    B      XO     0
          OPCD  S         00000    00000  XO     0
          OPCD  S         0 SR     00000  XO     0
          OPCD  S         A        SH     XO     Rc
          OPCD  crfD 0 L  A        B      XO     0
          OPCD  crfD 00   A        B      XO     0
          OPCD  crfD 00   crfS 00  00000  XO     0
          OPCD  crfD 00   00000    00000  XO     0
          OPCD  crfD 00   00000    IMM 0  XO     Rc
          OPCD  TO        A        B      XO     0
          OPCD  D         00000    B      XO     Rc
          OPCD  D         00000    00000  XO     Rc
          OPCD  crbD      00000    00000  XO     Rc
          OPCD  00000     A        B      XO     0
          OPCD  00000     00000    B      XO     0
          OPCD  00000     00000    00000  XO     0


Specific Instructions

Name         0-5  6-10      11-15    16-20  21-30  31
andx         31   S         A        B      28     Rc
andcx        31   S         A        B      60     Rc
cmp          31   crfD 0 L  A        B      0      0
cmpl         31   crfD 0 L  A        B      32     0
cntlzwx      31   S         A        00000  26     Rc
dcba 1       31   00000     A        B      758    0
dcbf         31   00000     A        B      86     0
dcbi 2       31   00000     A        B      470    0
dcbst        31   00000     A        B      54     0
dcbt         31   00000     A        B      278    0
dcbtst       31   00000     A        B      246    0
dcbz         31   00000     A        B      1014   0
eciwx        31   D         A        B      310    0
ecowx        31   S         A        B      438    0
eieio        31   00000     00000    00000  854    0
eqvx         31   S         A        B      284    Rc
extsbx       31   S         A        00000  954    Rc
extshx       31   S         A        00000  922    Rc
fabsx        63   D         00000    B      264    Rc
fcmpo        63   crfD 00   A        B      32     0
fcmpu        63   crfD 00   A        B      0      0
fctiwx       63   D         00000    B      14     Rc
fctiwzx      63   D         00000    B      15     Rc
fmrx         63   D         00000    B      72     Rc
fnabsx       63   D         00000    B      136    Rc
fnegx        63   D         00000    B      40     Rc
frspx        63   D         00000    B      12     Rc
icbi         31   00000     A        B      982    0
lbzux        31   D         A        B      119    0
lbzx         31   D         A        B      87     0
lfdux        31   D         A        B      631    0
lfdx         31   D         A        B      599    0
lfsux        31   D         A        B      567    0
lfsx         31   D         A        B      535    0
lhaux        31   D         A        B      375    0
lhax         31   D         A        B      343    0
lhbrx        31   D         A        B      790    0
lhzux        31   D         A        B      311    0
lhzx         31   D         A        B      279    0
lswi 3       31   D         A        NB     597    0
lswx 3       31   D         A        B      533    0
lwarx        31   D         A        B      20     0
lwbrx        31   D         A        B      534    0
lwzux        31   D         A        B      55     0
lwzx         31   D         A        B      23     0
mcrfs        63   crfD 00   crfS 00  00000  64     0
mcrxr        31   crfD 00   00000    00000  512    0
mfcr         31   D         00000    00000  19     0
mffsx        63   D         00000    00000  583    Rc
mfmsr 2      31   D         00000    00000  83     0
mfsr 2       31   D         0 SR     00000  595    0
mfsrin 2     31   D         00000    B      659    0
mtfsb0x      63   crbD      00000    00000  70     Rc
mtfsb1x      63   crbD      00000    00000  38     Rc
mtfsfix      63   crfD 00   00000    IMM 0  134    Rc
mtmsr 2      31   S         00000    00000  146    0
mtsr 2       31   S         0 SR     00000  210    0
mtsrin 2     31   S         00000    B      242    0
nandx        31   S         A        B      476    Rc
norx         31   S         A        B      124    Rc
orx          31   S         A        B      444    Rc
orcx         31   S         A        B      412    Rc
slwx         31   S         A        B      24     Rc
srawx        31   S         A        B      792    Rc
srawix       31   S         A        SH     824    Rc
srwx         31   S         A        B      536    Rc
stbux        31   S         A        B      247    0
stbx         31   S         A        B      215    0
stfdux       31   S         A        B      759    0
stfdx        31   S         A        B      727    0
stfiwx 1     31   S         A        B      983    0
stfsux       31   S         A        B      695    0
stfsx        31   S         A        B      663    0
sthbrx       31   S         A        B      918    0
sthux        31   S         A        B      439    0
sthx         31   S         A        B      407    0
stswi 3      31   S         A        NB     725    0
stswx 3      31   S         A        B      661    0
stwbrx       31   S         A        B      662    0
stwcx.       31   S         A        B      150    1
stwux        31   S         A        B      183    0
stwx         31   S         A        B      151    0
sync         31   00000     00000    00000  598    0
tlbia 1,2    31   00000     00000    00000  370    0
tlbie 1,2    31   00000     00000    B      306    0
tlbsync 1,2  31   00000     00000    00000  566    0
tw           31   TO        A        B      4      0
xorx         31   S         A        B      316    Rc

Notes:

1 Optional instruction

2 Supervisor-level instruction

3 Load/store string/multiple instruction


A.5 Instruction Set Legend 

Table A-36 provides general information on the PowerPC instruction set (such as the
architectural level, privilege level, and form).


Table A-36. PowerPC Instruction Set Legend

Name      UISA  VEA  OEA  Supervisor Level  Optional  Form
addx      √                                           XO
addcx     √                                           XO
addex     √                                           XO
addi      √                                           D
addic     √                                           D
addic.    √                                           D
addis     √                                           D
addmex    √                                           XO
addzex    √                                           XO
andx      √                                           X
andcx     √                                           X
andi.     √                                           D
andis.    √                                           D
bx        √                                           I
bcx       √                                           B
bcctrx    √                                           XL
bclrx     √                                           XL
cmp       √                                           X
cmpi      √                                           D
cmpl      √                                           X
cmpli     √                                           D
cntlzwx   √                                           X
crand     √                                           XL
crandc    √                                           XL
creqv     √                                           XL
crnand    √                                           XL
crnor     √                                           XL
cror      √                                           XL
crorc     √                                           XL


fnegx     √
fnmaddx   √
fnmaddsx  √
fnmsubx   √
fnmsubsx  √
fresx     √
frspx     √
frsqrtex  √
fselx     √
fsqrtx    √
fsqrtsx   √
fsubx     √
fsubsx    √
icbi
isync
lbz       √
lbzu      √
lbzux     √
lbzx      √
lfd       √
lfdu      √
lfdux     √
lfdx      √
lfs       √
lfsu      √
lfsux     √
lfsx      √
lha       √
lhau      √
lhaux     √
lhax      √
lhbrx     √
lhz       √
mftb            √
mtcrf     √
mtfsb0x   √
mtfsb1x   √
mtfsfx    √
mtfsfix   √
mtmsr
mtspr 1   √
mtsr
mtsrin
mulhwx    √
mulhwux   √
stswx 2 


stw 


stwbrx 


stwcx. 


stwu 


stwux 


stwx 


subfx 


subfcx 


subfex 


subfic 


subfmex 


subfzex 


sync 


tlbiax 


tlbiex 


tlbsync 


tw 


twi 


xorx 


xori 


xoris 



V 

V 

V 

V 

V 

V 




A 


1 Supervisor- and user-level instruction 

2 Load/store string or multiple instruction 
Appendix B. POWER Architecture Cross Reference


This appendix identifies the incompatibilities that must be managed when migrating from the
POWER architecture to the PowerPC architecture. Some of the incompatibilities can, at least
in principle, be detected by the processor, which traps and lets software simulate the 
POWER operation. Others cannot be detected by the processor. 

In general, the incompatibilities identified here are those that affect a POWER application 
program. Incompatibilities for instructions that can be used only by POWER system 
programs are not discussed. 

NOTE: This appendix describes incompatibilities with respect to the PowerPC 
architecture in general. 

B.1 New Instructions, Formerly Supervisor-Level Instructions

Instructions new to PowerPC typically use opcode values (including extended opcodes) 
which are illegal in the POWER architecture. A few instructions that are supervisor-level 
in the POWER architecture (for example, dclz, called dcbz in the PowerPC architecture) 
have been made user-level in the PowerPC architecture.

Any POWER program that executes one of these now-valid, or now-user-level, instructions 
expecting to cause the system illegal instruction error handler (program exception), or the 
system supervisor-level instruction error handler to be invoked, will not execute correctly 
on PowerPC processors. 

NOTE: In the architecture specification, user- and supervisor-level are referred to as 
problem and privileged state, respectively, and exceptions are referred to as 
interrupts. 

B.2 New Supervisor-Level Instructions 

The following instructions are user-level in the POWER architecture but are
supervisor-level in PowerPC processors.

• mfmsr 

• mfsr 



B.3 Reserved Bits in Instructions 

Reserved bits are shown as zeros, and the bit field is shaded, in the instruction opcode
definitions. In the POWER architecture such bits are ignored by the processor. In the
PowerPC architecture they must be zero or the instruction form is invalid. In several
cases, the PowerPC architecture assumes that such bits in POWER instructions are indeed
zero. The cases include the following:

• cmpi, cmp, cmpli, and cmpl assume that bit 10 in the POWER instructions is 0. 

• mtspr and mfspr assume that bits 16-20 in the POWER instructions are 0. 
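
A translator or binary checker could test the two cases above before accepting a POWER instruction word. The Python sketch below is illustrative only: the function name is hypothetical, and the opcode and extended-opcode values are taken from the tables in Appendix A (cmpli 10, cmpi 11, cmp 31/0, cmpl 31/32, mfspr 31/339, mtspr 31/467).

```python
def power_reserved_bits_clear(word):
    """Return True if the bits that PowerPC assumes to be zero are zero."""
    opcd = (word >> 26) & 0x3F
    xo = (word >> 1) & 0x3FF
    is_compare = opcd in (10, 11) or (opcd == 31 and xo in (0, 32))
    if is_compare:
        return ((word >> 21) & 1) == 0        # bit 10
    if opcd == 31 and xo in (339, 467):
        return ((word >> 11) & 0x1F) == 0     # bits 16-20
    return True                               # no assumption covered by this sketch
```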

B.4 Reserved Bits in Registers 

The POWER architecture defines reserved bits in registers to be zero when read, and
allows either zero or one to be written to them. In the PowerPC architecture it is
implementation-dependent, for each register, whether these bits read as zero and ignore
writes, or are copied from source to destination when read or written.

B.5 Alignment Check 

The AL bit in the POWER machine state register, MSR[24], is not supported in the 
PowerPC architecture. The bit is reserved in the PowerPC architecture. The low-order bits 
of the EA are always used. Notice that value zero — the normal value for a reserved SPR 
bit — means ignore the low-order EA bits in the POWER architecture, and value one means 
use the low-order EA bits. However, MSR[24] is not assigned new meaning in the PowerPC 
architecture. 

B.6 Condition Register 

The following instructions specify a field in the condition register (CR) explicitly (via the
crfD field) and also have the record bit (Rc) option. In the PowerPC architecture, if Rc = 1
for these instructions the instruction form is invalid. In the POWER architecture, if Rc = 1
the instructions execute normally except as shown in Table B-1.


Table B-1. Condition Register Settings

Instruction  Setting
cmp          CR0 is undefined if Rc = 1 and crfD ≠ 0
cmpl         CR0 is undefined if Rc = 1 and crfD ≠ 0
mcrxr        CR0 is undefined if Rc = 1 and crfD ≠ 0
fcmpu        CR1 is undefined if Rc = 1
fcmpo        CR1 is undefined if Rc = 1
mcrfs        CR1 is undefined if Rc = 1 and crfD ≠ 1
B.7 Inappropriate Use of LK and Rc Bits

For the instructions listed below, if LK = 1 or Rc = 1, POWER processors execute the
instruction normally with the exception of setting the link register (if LK = 1) or the CR0
or CR1 fields (if Rc = 1) to an undefined value. In the PowerPC architecture, such
instruction forms are invalid.

The PowerPC instruction form is invalid if LK = 1:

• sc (svcx in the POWER architecture) 

• Condition register logical instructions (that is, crand, crandc, creqv, crnand, 
crnor, cror, crorc, and crxor) 

• mcrf 

• isync (ics in the POWER architecture) 

The PowerPC instruction form is invalid if Rc = 1:

• Integer X-form load and store instructions: 

— X-form load instructions — lbzux, lbzx, ldarx, ldux, ldx, lhaux, lhax, lhbrx, 
lhzux, lhzx, lswi, lswx, lwarx, lwaux, lwax, lwbrx, lwzux, lwzx 

— X-form store instructions — stbux, stbx, stdcx., stdux, stdx, sthbrx, sthux, 
sthx, stswi, stswx, stwbrx, stwcx., stwux, stwx 

• Integer X-form compare instructions (that is, cmp, cmpl) 

• X-form trap instruction (that is, td) 

• mtspr, mfspr, mtcrf, mcrxr, mfcr 

• Floating-point X-form load and store instructions and floating-point compare 
instructions 

— Floating-point X-form load instructions — lfdux, lfdx, lfsux, lfsx 
— Floating-point X-form store instructions — stfdux, stfdx, stfiwx, stfsux, stfsx 
— Floating-point X-form compare instruction — fcmpo, fcmpu 

• mcrfs 

• dcbz (dclz in the POWER architecture) 
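
The Rc = 1 cases above can be checked mechanically once the primary and extended opcodes are known. The Python sketch below covers only a small, illustrative subset of the listed instructions, using opcode values from Appendix A. The table name and function name are hypothetical.

```python
# (primary opcode, extended opcode) -> mnemonic; a subset of the list above.
RC_MUST_BE_ZERO = {
    (31, 0): "cmp", (31, 32): "cmpl", (31, 23): "lwzx", (31, 151): "stwx",
    (31, 19): "mfcr", (31, 512): "mcrxr", (31, 1014): "dcbz",
    (63, 0): "fcmpu", (63, 32): "fcmpo", (63, 64): "mcrfs",
}

def invalid_rc_form(word):
    """Return the mnemonic if bit 31 is set on a form that forbids it, else None."""
    opcd = (word >> 26) & 0x3F
    xo = (word >> 1) & 0x3FF              # bits 21-30
    if word & 1:                          # bit 31 (Rc position)
        return RC_MUST_BE_ZERO.get((opcd, xo))
    return None
```

An instruction such as addx, which legitimately has an Rc option, is not in the table and is therefore never reported.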

B.8 BO Field 

The POWER architecture shows certain bits in the BO field — used by branch conditional 
instructions — as x without indicating how these bits are to be interpreted. These bits are 
ignored by POWER processors. 

The PowerPC architecture shows these bits as either z or y. The z bits are ignored, as in 
POWER. However, the y bit need not be ignored, but rather can be used to give a hint about 
whether the branch is likely to be taken. If a POWER program has the incorrect value for 
this bit, the program will run correctly but performance may suffer. 
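
As a sketch of where the hint bit sits, the Python fragment below extracts it from a branch conditional instruction. It assumes the BO encodings given in the branch instruction descriptions (not repeated in this appendix): the y bit is the low-order bit of the 5-bit BO field, except in the branch-always encoding (1z1zz), which has no y bit. The function name is hypothetical.

```python
def branch_hint_bit(word):
    """Return the y (prediction hint) bit of a branch conditional, or None."""
    bo = (word >> 21) & 0x1F              # BO field, bits 6-10
    if (bo & 0b10100) == 0b10100:         # 1z1zz: branch always, no y bit
        return None
    return bo & 1                         # y is the last bit of BO

hint = branch_hint_bit(0x41A20008)        # bc with BO = 0b01101, BI = 2
```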



B.9 Branch Conditional to Count Register 

For the case in which the count register is decremented and tested (that is, the case in which
BO[2] = 0), the POWER architecture specifies only that the branch target address is
undefined, implying that the count register, and the link register (if LK = 1), are updated in
the normal way. The PowerPC architecture considers this instruction form invalid.
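
The invalid form described above can be detected from the instruction word alone. The following Python sketch assumes the bcctrx encoding (primary opcode 19, extended opcode 528, BO in bits 6-10), where BO[2] is the third bit of the 5-bit BO field counting from 0 at the left. The function name is hypothetical.

```python
def is_invalid_bcctr(word):
    """True for a bcctr/bcctrl whose BO[2] = 0 (decrement and test CTR)."""
    opcd = (word >> 26) & 0x3F
    xo = (word >> 1) & 0x3FF
    bo = (word >> 21) & 0x1F
    return opcd == 19 and xo == 528 and (bo & 0b00100) == 0

plain_bctr = is_invalid_bcctr(0x4E800420)   # bctr, BO = 0b10100
```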

B.10 System Call/Supervisor Call 

The System Call (sc) instruction in the PowerPC architecture is called Supervisor Call 
(svcx) in the POWER architecture. Differences in implementations are as follows: 

• The POWER architecture provides a version of the svcx instruction (bit 30 = 0) that
allows instruction fetching to continue at any one of 128 locations. It is used for “fast
Supervisor Calls.” The PowerPC architecture provides no such version. If bit 30 of
the instruction is zero the instruction form is invalid.

• The POWER architecture provides a version of the svcx instruction
(bits 30-31 = 0b11) that resumes instruction fetching at one location and sets the
link register (LR) to the address of the next instruction. The PowerPC architecture
provides no such version; if Rc = 1, the instruction form is invalid.

• For the POWER architecture, information from the MSR is saved in the count 
register (CTR). For the PowerPC architecture, this information is saved in the 
machine status save/restore register 1 (SRR1). 

• The POWER architecture permits bits 16-29 of the instruction to be nonzero, while 
in the PowerPC architecture, such an instruction form is invalid. 

• The POWER architecture saves the low-order 16 bits of the svcx instruction in the 
CTR; the PowerPC architecture does not save them. 

• The settings of the MSR bits by the system call exception differ between the 
POWER architecture and the PowerPC architecture. 
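
The encoding-related differences in the list above can be expressed as checks on the instruction word. This Python sketch is illustrative: it covers only the three bit-field conditions named in the list (bit 30, bit 31, and bits 16-29), uses primary opcode 17 from Table A-24, and the function name is hypothetical.

```python
SC_OPCODE = 17

def sc_form_problems(word):
    """List the reasons a POWER svcx word is an invalid PowerPC sc form."""
    if ((word >> 26) & 0x3F) != SC_OPCODE:
        return ["not an sc/svcx instruction"]
    problems = []
    if ((word >> 1) & 1) == 0:
        problems.append("bit 30 is 0")
    if word & 1:
        problems.append("bit 31 is 1")
    if (word >> 2) & 0x3FFF:
        problems.append("bits 16-29 are nonzero")
    return problems
```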

B.11 XER Register 

Bits 16-23 of the XER are reserved in the PowerPC architecture, whereas in the POWER 
architecture they are defined to contain the comparison byte for the lscbx instruction, which 
is not included in the PowerPC architecture. 
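
For example, software that carries an XER image over from POWER could clear the former comparison byte before using the value on a PowerPC processor. The following Python sketch assumes a 32-bit XER with bit 0 as the most-significant bit, so that bits 16-23 correspond to the mask 0x0000FF00. The names are hypothetical.

```python
XER_BITS_16_23 = 0xFF << (31 - 23)        # 0x0000FF00

def clear_power_compare_byte(xer):
    """Zero the bits that held the lscbx comparison byte in POWER."""
    return xer & ~XER_BITS_16_23 & 0xFFFFFFFF

cleaned = clear_power_compare_byte(0x2000AB05)
```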

B.12 Update Forms of Memory Access 

The PowerPC architecture requires that rA not be equal to either rD (integer load only) or 
zero. If the restriction is violated, the instruction form is invalid. See Section 4.1.3, “Classes 
of Instructions,” for information about invalid instructions. The POWER architecture 
permits these cases and simply avoids saving the EA. 
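
The restriction can be stated as a small predicate. The Python sketch below is illustrative; the function name and parameters are hypothetical.

```python
def update_form_is_valid(ra, rd, integer_load):
    """Check the PowerPC restriction on update-form loads and stores."""
    if ra == 0:
        return False                      # rA must not be zero
    if integer_load and ra == rd:
        return False                      # rA must not equal rD for integer loads
    return True

ok = update_form_is_valid(ra=1, rd=3, integer_load=True)   # e.g. lwzu r3,d(r1)
```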
B.13 Multiple Register Loads

When executing instructions that load multiple registers, the PowerPC architecture requires 
that rA, and rB if present in the instruction format, not be in the range of registers to be 
loaded, while the POWER architecture permits this and does not alter rA or rB in this case. 
(The PowerPC architecture restriction applies even if rA = 0, although there is no obvious 
benefit to the restriction in this case since rA is not used to compute the effective address 
if rA = 0.) If the PowerPC architecture restriction is violated, either the system illegal 
instruction error handler is invoked or the results are boundedly undefined. 

The instructions affected are listed as follows: 

• lmw (lm in the POWER architecture) 

• lswi (lsi in the POWER architecture) 

• lswx (lsx in the POWER architecture) 

For example, an lmw instruction that loads all 32 registers is valid in the POWER 
architecture but is an invalid form in the PowerPC architecture. 
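
The lmw case of this restriction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, and only lmw is modeled (it loads rD through r31); the string instructions, whose register range depends on the byte count, are not covered.

```python
def lmw_is_invalid_form_powerpc(rd, ra):
    """Return True if 'lmw rD,d(rA)' violates the PowerPC restriction.

    lmw loads registers rD through r31, so the form is invalid whenever
    the rA field names a register in that range. Per the text, this holds
    even for rA = 0, where rA is not used to compute the effective address.
    """
    if not (0 <= rd <= 31 and 0 <= ra <= 31):
        raise ValueError("register numbers must be in 0..31")
    return ra in range(rd, 32)


# Loading all 32 registers (rD = 0) is invalid for every choice of rA.
all_invalid = all(lmw_is_invalid_form_powerpc(0, ra) for ra in range(32))
```

With rD = 0 every register is in the loaded range, which is why the all-32-register form mentioned above can never be valid in the PowerPC architecture.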

B.14 Alignment for Load/Store Multiple 

When executing load/store multiple instructions, the PowerPC architecture requires the EA 
to be word-aligned and yields an alignment exception or boundedly undefined results if it 
is not. The POWER architecture specifies that an alignment exception occurs (if AL = 1). 

B.15 Load and Store String Instructions 

In the PowerPC architecture, an lswx instruction with zero length leaves the content of rD 
undefined (if rD ≠ rA and rD ≠ rB) or is an invalid instruction form (if rD = rA or 
rD = rB), while in the POWER architecture the corresponding instruction (lsx) is a no-op 
in these cases. 

Note also that, in the PowerPC architecture, an lswx instruction with zero length may alter 
the referenced bit, and an stswx instruction with zero length may alter the referenced and 
changed bits, while in the POWER architecture the corresponding instructions (lsx and 
stsx) do not alter the referenced and changed bits. 

B.16 Synchronization 

The sync instruction (called dcs in the POWER architecture) and the isync instruction 
(called ics in the POWER architecture) cause a much more pervasive synchronization 
in the PowerPC architecture than in the POWER architecture. For more information, refer 
to Chapter 8, “Instruction Set.” 

B.17 Move to/from SPR 

Differences in how the Move to/from Special Purpose Register (mtspr and mfspr) 
instructions function are as follows: 

• The SPR field is 10 bits long in the PowerPC architecture, but only 5 bits long in the 
POWER architecture. 

• The mfspr instruction can be used to read the decrementer (DEC) register in 
problem state (user mode) in the POWER architecture, but only in supervisor state 
in the PowerPC architecture. 

• If the SPR value specified in the instruction is not one of the defined values, the 
POWER architecture behaves as follows: 

- If the instruction is executed in user-level privilege state and SPR[0] = 1, a 
supervisor-level instruction type program exception occurs. No architected 
registers are altered except those set by the exception. 

- If the instruction is executed in supervisor-level privilege state and SPR[0] = 0, 
no architected registers are altered. 

In this same case, the PowerPC architecture behaves as follows: 

- If the instruction is executed in user-level privilege state and SPR[0] = 1, either 
an illegal instruction type program exception or a supervisor-level instruction 
type program exception occurs. No architected registers are altered except those 
set by the exception. 

- Otherwise (the instruction is executed in supervisor-level privilege state or 
SPR[0] = 0), either an illegal instruction type program exception occurs (in 
which case no architected registers are altered except those set by the exception) 
or the results are boundedly undefined. 
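
The cases listed above for an undefined SPR value can be summarized as a small decision function. The following Python sketch is purely illustrative; the function name and the outcome strings are hypothetical labels, and combinations that the text does not describe for the POWER architecture are returned as None rather than guessed.

```python
def undefined_spr_outcomes(arch, supervisor, spr0):
    """Permitted outcomes of mtspr/mfspr with an undefined SPR value.

    arch       -- "POWER" or "PowerPC"
    supervisor -- True for supervisor-level privilege state
    spr0       -- value of bit SPR[0] of the SPR field (0 or 1)
    """
    if arch == "PowerPC":
        if not supervisor and spr0 == 1:
            return {"illegal instruction program exception",
                    "supervisor-level instruction program exception"}
        # "Otherwise" case: supervisor state or SPR[0] = 0.
        return {"illegal instruction program exception",
                "boundedly undefined"}
    if arch == "POWER":
        if not supervisor and spr0 == 1:
            return {"supervisor-level instruction program exception"}
        if supervisor and spr0 == 0:
            return {"no architected registers altered"}
        return None  # combination not described in the text above
    raise ValueError("unknown architecture: " + arch)
```

Returning a set reflects that the PowerPC architecture permits more than one outcome in each case, so portable software cannot depend on either.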

B.18 Effects of Exceptions on FPSCR Bits FR and FI 

For the following cases, the POWER architecture does not specify how the FR and FI bits 
are set, while the PowerPC architecture preserves them for invalid operation exceptions 
caused by compare instructions and clears them otherwise. 

• Invalid operation exception (enabled or disabled) 

• Zero divide exception (enabled or disabled) 

• Disabled overflow exception 

B.19 Floating-Point Store Single Instructions 

There are several respects in which the PowerPC architecture is incompatible with the 
POWER architecture when executing store floating-point single instructions. 

The POWER architecture uses FPSCR[UE] to help determine whether denormalization 
should be done, while the PowerPC architecture does not. 

NOTE: In the PowerPC architecture, if FPSCR[UE] = 1 and a denormalized single-precision 
number is copied from one memory location to another by means of an 
lfs instruction followed by an stfs instruction, the two “copies” may not be the 
same. Refer to Section 3.3.6.2.2, “Underflow Exception Condition,” for more 
information about underflow exceptions. 

For an operand having an exponent that is less than 874 (an unbiased exponent less than 
-149), the POWER architecture specifies storage of a zero (if FPSCR[UE] = 0), while the 
PowerPC architecture specifies the storage of an undefined value. 
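
The threshold of 874 is the double-precision biased exponent of 2^-149 (1023 - 149 = 874), which is also the magnitude of the smallest single-precision denormalized number. The short Python check below illustrates that relationship using IEEE bit patterns; the helper name is hypothetical and the snippet is a sketch, not part of the architecture definition.

```python
import struct


def double_biased_exponent(x):
    """Return the 11-bit biased exponent field of x as an IEEE double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return (bits >> 52) & 0x7FF


# Smallest positive single-precision denormal: only the low fraction bit set.
smallest_single_denormal = struct.unpack(">f", bytes.fromhex("00000001"))[0]

# Its value is 2**-149, whose double-format biased exponent is 874.
exponent_of_smallest = double_biased_exponent(smallest_single_denormal)
```

Any operand with a biased exponent below 874 is therefore smaller in magnitude than every nonzero single-precision value.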

B.20 Move from FPSCR 

The POWER architecture defines the high-order 32 bits of the result of mffs to be 
0xFFFF_FFFF. In the PowerPC architecture they are undefined. 

B.21 Clearing Bytes in the Data Cache 

The dclz instruction of the POWER architecture and the dcbz instruction of the PowerPC 
architecture have the same opcode. However, the functions differ in the following respects. 

• The dclz instruction clears a line; dcbz clears a block. 

• The dclz instruction saves the EA in rA (if rA ≠ 0); dcbz does not. 

• The dclz instruction is supervisor-level; dcbz is not. 

B.22 Segment Register Instructions 

The definitions of the four segment register instructions (mtsr, mtsrin, mfsr, and mfsrin) 
differ in two respects between the POWER architecture and the PowerPC architecture. 
Instructions similar to mtsrin and mfsrin are called mtsri and mfsri in the POWER 
architecture. The differences are as follows: 

• Privilege: mfsr and mfsri are problem state instructions in the POWER 
architecture, while mfsr and mfsrin are supervisor-level in the PowerPC 
architecture. 

• Function: the indirect instructions (mtsri and mfsri) in the POWER architecture 
use an rA register in computing the segment register number, and the computed EA 
is stored into rA (if rA ≠ 0 and rA ≠ rD); in the PowerPC architecture mtsrin and 
mfsrin have no rA field and the EA is not stored. 

The mtsr, mtsrin (mtsri), and mfsr instructions have the same opcodes in the PowerPC 
architecture as in the POWER architecture. The mfsri instruction in the POWER 
architecture and the mfsrin instruction in PowerPC architecture have different opcodes. 

B.23 TLB Entry Invalidation 

The tlbi instruction in the POWER architecture and the tlbie instruction in the PowerPC 
architecture have the same opcode. However, the functions differ in the following respects. 

• The tlbi instruction computes the EA as (rA|0) + rB, while tlbie lacks an rA field 
and computes the EA as rB. 

• The tlbi instruction saves the EA in rA (if rA ≠ 0); tlbie lacks an rA field and does 
not save the EA. 

B.24 Floating-Point Exceptions 

Both the PowerPC and the POWER architectures use bit 20 of the MSR to control the 
generation of exceptions for floating-point enabled exceptions. However, in the PowerPC 
architecture this bit is part of a 2-bit value which controls the occurrence, precision, and 
recoverability of the exception, whereas, in the POWER architecture this bit is used 
independently to control the occurrence of the exception (in the POWER architecture all 
floating-point exceptions are precise). 

B.25 Timing Facilities 

This section describes differences between the POWER architecture and the PowerPC 
architecture timer facilities. 

B.25.1 Real-Time Clock 

The POWER real-time clock (RTC) is not supported in the PowerPC architecture. Instead, 
the PowerPC architecture provides a time base register (TB). Both the RTC and the TB are 
64-bit special-purpose registers, but they differ in the following respects: 

• The RTC counts seconds and nanoseconds, while the TB counts ticks. The 
frequency of the TB is implementation-dependent. 

• The RTC increments discontinuously: 1 is added to RTCU when the value in RTCL 
passes 999_999_999. The TB increments continuously: 1 is added to TBU when 
the value in TBL passes 0xFFFF_FFFF. 

• The RTC is written and read by the mtspr and mfspr instructions, using SPR 
numbers that denote the RTCU and RTCL. The TB is written by the mtspr 
instruction (using new SPR numbers) and read by the new mftb instruction. 

• The SPR numbers that denote the POWER architecture’s RTCL and RTCU are invalid 
in the PowerPC architecture. 

• The RTC is guaranteed to increment at least once in the time required to execute ten 
Add Immediate (addi) instructions. No analogous guarantee is made for the TB. 

• Not all bits of RTCL need be implemented, while all bits of the TB must be 
implemented. 
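
The difference between the two carry rules can be made concrete with a small model. The Python sketch below is illustrative only: the function names are hypothetical, each call advances the counter by one unit (one nanosecond for the RTC, one tick for the TB), and unimplemented low-order RTCL bits are ignored.

```python
MASK32 = 0xFFFFFFFF


def rtc_step(rtcu, rtcl):
    """Advance the POWER RTC by one nanosecond.

    RTCL counts nanoseconds within the current second, so the carry into
    RTCU happens when RTCL passes 999_999_999 (a decimal boundary).
    """
    rtcl += 1
    if rtcl > 999_999_999:
        rtcl = 0
        rtcu = (rtcu + 1) & MASK32
    return rtcu, rtcl


def tb_step(tbu, tbl):
    """Advance the PowerPC time base by one tick.

    TBU:TBL behaves as a single 64-bit binary counter, so the carry into
    TBU happens when TBL passes 0xFFFF_FFFF.
    """
    tbl = (tbl + 1) & MASK32
    if tbl == 0:
        tbu = (tbu + 1) & MASK32
    return tbu, tbl
```

One consequence is that software can treat TBU and TBL together as one 64-bit integer, whereas the two halves of the RTC must be combined as seconds times 10^9 plus nanoseconds.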

B.25.2 Decrementer 

The decrementer (DEC) register differs, in the PowerPC and POWER architectures, in the 
following respects: 

• The PowerPC architecture DEC register decrements at the same rate that the TB 
increments, while the POWER decrementer decrements every nanosecond (which is 
the same rate that the RTC increments). 

• Not all bits of the POWER DEC need be implemented, while all bits of the PowerPC 
DEC must be implemented. 

• The exception caused by the DEC has its own exception vector location in the 
PowerPC architecture, but is considered an external exception in the POWER 
architecture. 

B.26 Deleted Instructions 

The following instructions, shown in Table B-2, are part of the POWER architecture but 
have been dropped from the PowerPC architecture. 


Table B-2. Deleted POWER Instructions 

Mnemonic  Instruction                               Primary Opcode  Extended Opcode
abs       Absolute                                  31              360
clcs      Cache Line Compute Size                   31              531
clf       Cache Line Flush                          31              118
cli       Cache Line Invalidate                     31              502
dclst     Data Cache Line Store                     31              630
div       Divide                                    31              331
divs      Divide Short                              31              363
doz       Difference or Zero                        31              264
dozi      Difference or Zero Immediate              09              -
lscbx     Load String and Compare Byte Indexed      31              277
maskg     Mask Generate                             31              29
maskir    Mask Insert from Register                 31              541
mfsri     Move from Segment Register Indirect       31              627
mul       Multiply                                  31              107
nabs      Negative Absolute                         31              488
rac       Real Address Compute                      31              818
rlmi      Rotate Left then Mask Insert              22              -
rrib      Rotate Right and Insert Bit               31              537
sle       Shift Left Extended                       31              153
sleq      Shift Left Extended with MQ               31              217
sliq      Shift Left Immediate with MQ              31              184
slliq     Shift Left Long Immediate with MQ         31              248
sllq      Shift Left Long with MQ                   31              216
slq       Shift Left with MQ                        31              152
sraiq     Shift Right Algebraic Immediate with MQ   31              952
sraq      Shift Right Algebraic with MQ             31              920
sre       Shift Right Extended                      31              665
srea      Shift Right Extended Algebraic            31              921
sreq      Shift Right Extended with MQ              31              729
sriq      Shift Right Immediate with MQ             31              696
srliq     Shift Right Long Immediate with MQ        31              760
srlq      Shift Right Long with MQ                  31              728
srq       Shift Right with MQ                       31              664

Note: Many of these instructions use the MQ register. The MQ is not defined in the 
PowerPC architecture. 

B.27 POWER Instructions Supported by the PowerPC Architecture 

Table B-3 lists the POWER instructions implemented in the PowerPC architecture. 


Table B-3. POWER Instructions Implemented in PowerPC Architecture 

POWER                                               PowerPC
Mnemonic  Instruction                               Mnemonic  Instruction
ax        Add                                       addcx     Add Carrying
aex       Add Extended                              addex     Add Extended
ai        Add Immediate                             addic     Add Immediate Carrying
ai.       Add Immediate and Record                  addic.    Add Immediate Carrying and Record
amex      Add to Minus One Extended                 addmex    Add to Minus One Extended
andil.    AND Immediate Lower                       andi.     AND Immediate
andiu.    AND Immediate Upper                       andis.    AND Immediate Shifted
azex      Add to Zero Extended                      addzex    Add to Zero Extended
bccx      Branch Conditional to Count Register      bcctrx    Branch Conditional to Count Register
bcrx      Branch Conditional to Link Register       bclrx     Branch Conditional to Link Register
cal       Compute Address Lower                     addi      Add Immediate
cau       Compute Address Upper                     addis     Add Immediate Shifted
caxx      Compute Address                           addx      Add
cntlzx    Count Leading Zeros                       cntlzwx   Count Leading Zeros Word
dclz      Data Cache Line Set to Zero               dcbz      Data Cache Block Set to Zero
dcs       Data Cache Synchronize                    sync      Synchronize
extsx     Extend Sign                               extshx    Extend Sign Half Word
fax       Floating Add                              faddx     Floating Add
fdx       Floating Divide                           fdivx     Floating Divide
fmx       Floating Multiply                         fmulx     Floating Multiply
fmax      Floating Multiply-Add                     fmaddx    Floating Multiply-Add
fmsx      Floating Multiply-Subtract                fmsubx    Floating Multiply-Subtract
fnmax     Floating Negative Multiply-Add            fnmaddx   Floating Negative Multiply-Add
fnmsx     Floating Negative Multiply-Subtract       fnmsubx   Floating Negative Multiply-Subtract
fsx       Floating Subtract                         fsubx     Floating Subtract
ics       Instruction Cache Synchronize             isync     Instruction Synchronize
l         Load                                      lwz       Load Word and Zero
lbrx      Load Byte-Reverse Indexed                 lwbrx     Load Word Byte-Reverse Indexed
lm        Load Multiple                             lmw       Load Multiple Word
lsi       Load String Immediate                     lswi      Load String Word Immediate
lsx       Load String Indexed                       lswx      Load String Word Indexed
lu        Load with Update                          lwzu      Load Word and Zero with Update
lux       Load with Update Indexed                  lwzux     Load Word and Zero with Update Indexed
lx        Load Indexed                              lwzx      Load Word and Zero Indexed
mtsri     Move to Segment Register Indirect         mtsrin    Move to Segment Register Indirect *
muli      Multiply Immediate                        mulli     Multiply Low Immediate
mulsx     Multiply Short                            mullwx    Multiply Low
oril      OR Immediate Lower                        ori       OR Immediate
oriu      OR Immediate Upper                        oris      OR Immediate Shifted
rlimix    Rotate Left Immediate then Mask Insert    rlwimix   Rotate Left Word Immediate then Mask Insert
rlinmx    Rotate Left Immediate then AND with Mask  rlwinmx   Rotate Left Word Immediate then AND with Mask
rlnmx     Rotate Left then AND with Mask            rlwnmx    Rotate Left Word then AND with Mask
sfx       Subtract from                             subfcx    Subtract from Carrying
sfex      Subtract from Extended                    subfex    Subtract from Extended
sfi       Subtract from Immediate                   subfic    Subtract from Immediate Carrying
sfmex     Subtract from Minus One Extended          subfmex   Subtract from Minus One Extended
sfzex     Subtract from Zero Extended               subfzex   Subtract from Zero Extended
slx       Shift Left                                slwx      Shift Left Word
srx       Shift Right                               srwx      Shift Right Word
srax      Shift Right Algebraic                     srawx     Shift Right Algebraic Word
sraix     Shift Right Algebraic Immediate           srawix    Shift Right Algebraic Word Immediate
st        Store                                     stw       Store Word
stbrx     Store Byte-Reverse Indexed                stwbrx    Store Word Byte-Reverse Indexed
stm       Store Multiple                            stmw      Store Multiple Word
stsi      Store String Immediate                    stswi     Store String Word Immediate
stsx      Store String Indexed                      stswx     Store String Word Indexed
stu       Store with Update                         stwu      Store Word with Update
stux      Store with Update Indexed                 stwux     Store Word with Update Indexed
stx       Store Indexed                             stwx      Store Word Indexed
svca      Supervisor Call                           sc        System Call
t         Trap                                      tw        Trap Word
ti        Trap Immediate                            twi       Trap Word Immediate
tlbi      TLB Invalidate Entry                      tlbie     Translation Lookaside Buffer Invalidate Entry *
xoril     XOR Immediate Lower                       xori      XOR Immediate
xoriu     XOR Immediate Upper                       xoris     XOR Immediate Shifted

* Supervisor-level instruction 

Appendix C. Multiple-Precision Shifts 

This appendix gives examples of how multiple-precision shifts can be programmed. A 
multiple-precision shift is defined to be a shift of an n-word quantity, where n > 1. 
The quantity to be shifted is contained in n registers. The shift amount is specified either by 
an immediate value in the instruction or by bits 27-31 of a register. 

The examples distinguish between the cases n = 2 and n > 2. If n = 2, the shift amount 
may be in the range 0-63. However, if n > 2, the shift amount must be in the range 0-31 
for the examples to yield the desired result. The specific instance shown for n > 2 is 
n = 3; extending those instruction sequences to larger n is straightforward, as is reducing 
them to the case n = 2 when the more stringent restriction on shift amount is met. For 
shifts with immediate shift amounts, only the case n = 3 is shown because the more 
stringent restriction on shift amount is always met. 

In the examples it is assumed that GPRs 2 and 3 (and 4) contain the quantity to be shifted, 
and that the result is to be placed into the same registers. For non-immediate shifts, the shift 
amount is assumed to be in bits 27-31 of GPR6. For immediate shifts, the shift amount is 
assumed to be greater than zero. GPRs 0 and 31 are used as scratch registers. For n > 2, the 
number of instructions required is 2n - 1 (immediate shifts) or 3n - 1 (non-immediate 
shifts). 

The following sections provide examples of multiple-precision shifts. 

C.1 Multiple-Precision Shifts in 32-Bit Implementations 

Shift Left Immediate, n = 3 (Shift Amount < 32) 

rlwinm  r2,r2,sh,0,31 - sh
rlwimi  r2,r3,sh,32 - sh,31
rlwinm  r3,r3,sh,0,31 - sh
rlwimi  r3,r4,sh,32 - sh,31
rlwinm  r4,r4,sh,0,31 - sh
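
As an informal check of the immediate sequence above, the following Python sketch models rlwinm and rlwimi on 32-bit values and compares the three-register result with an ordinary 96-bit shift. The helper names are hypothetical, bits are numbered 0 (most significant) to 31 as in this manual, and only non-wrapping masks (MB <= ME) are modeled.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (0 <= n <= 31)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def mask(mb, me):
    """Ones in bit positions mb..me (bit 0 is the MSB); assumes mb <= me."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32


def rlwinm(rs, sh, mb, me):
    """Rotate left word immediate, then AND with mask."""
    return rotl32(rs, sh) & mask(mb, me)


def rlwimi(ra, rs, sh, mb, me):
    """Rotate left word immediate, then insert under mask into ra."""
    m = mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)


def shift_left_imm_3(r2, r3, r4, sh):
    """Model of the five-instruction sequence; requires 0 < sh < 32."""
    if not 0 < sh < 32:
        raise ValueError("shift amount must be in 1..31")
    r2 = rlwinm(r2, sh, 0, 31 - sh)
    r2 = rlwimi(r2, r3, sh, 32 - sh, 31)
    r3 = rlwinm(r3, sh, 0, 31 - sh)
    r3 = rlwimi(r3, r4, sh, 32 - sh, 31)
    r4 = rlwinm(r4, sh, 0, 31 - sh)
    return r2, r3, r4
```

The requirement sh > 0 in the model mirrors the stated assumption for immediate shifts: with sh = 0 the mask-begin value 32 - sh would fall outside the 0-31 range.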

Shift Left, n = 2 (Shift Amount < 64) 

subfic  r31,r6,32
slw     r2,r2,r6
srw     r0,r3,r31
or      r2,r2,r0
addi    r31,r6,-32
slw     r0,r3,r31
or      r2,r2,r0
slw     r3,r3,r6
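
The n = 2 sequence relies on slw and srw producing zero when the 6-bit shift amount is 32 or greater. The Python sketch below models that behavior and walks through the eight instructions; the helper names are hypothetical, and the model assumes the shift amount in r6 is in the range 0-63.

```python
MASK32 = 0xFFFFFFFF


def slw(rs, rb):
    """Shift left word: uses the low-order 6 bits of rb; amounts >= 32 give 0."""
    n = rb & 0x3F
    return 0 if n >= 32 else (rs << n) & MASK32


def srw(rs, rb):
    """Shift right word: uses the low-order 6 bits of rb; amounts >= 32 give 0."""
    n = rb & 0x3F
    return 0 if n >= 32 else rs >> n


def shift_left_2(r2, r3, r6):
    """Model of the n = 2 shift-left sequence (r2 = high word, r3 = low word)."""
    r31 = (32 - r6) & MASK32   # subfic r31,r6,32
    r2 = slw(r2, r6)           # slw    r2,r2,r6
    r0 = srw(r3, r31)          # srw    r0,r3,r31
    r2 |= r0                   # or     r2,r2,r0
    r31 = (r6 - 32) & MASK32   # addi   r31,r6,-32
    r0 = slw(r3, r31)          # slw    r0,r3,r31
    r2 |= r0                   # or     r2,r2,r0
    r3 = slw(r3, r6)           # slw    r3,r3,r6
    return r2, r3
```

For any given shift amount, at most one of the two values ORed into r2 from r3 is nonzero: the srw term contributes when the amount is 32 or less, and the second slw term contributes when it is 32 or more (at exactly 32 both equal r3).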

Shift Left, n = 3 (Shift Amount < 32) 

subfic  r31,r6,32
slw     r2,r2,r6
srw     r0,r3,r31
or      r2,r2,r0
slw     r3,r3,r6
srw     r0,r4,r31
or      r3,r3,r0
slw     r4,r4,r6

Shift Right Immediate, n = 3 (Shift Amount < 32) 

rlwinm  r4,r4,32 - sh,sh,31
rlwimi  r4,r3,32 - sh,0,sh - 1
rlwinm  r3,r3,32 - sh,sh,31
rlwimi  r3,r2,32 - sh,0,sh - 1
rlwinm  r2,r2,32 - sh,sh,31

Shift Right, n = 2 (Shift Amount < 64) 

subfic  r31,r6,32
srw     r3,r3,r6
slw     r0,r2,r31
or      r3,r3,r0
addi    r31,r6,-32
srw     r0,r2,r31
or      r3,r3,r0
srw     r2,r2,r6

Shift Right, n = 3 (Shift Amount < 32) 

subfic  r31,r6,32
srw     r4,r4,r6
slw     r0,r3,r31
or      r4,r4,r0
srw     r3,r3,r6
slw     r0,r2,r31
or      r3,r3,r0
srw     r2,r2,r6

Shift Right Algebraic Immediate, n = 3 (Shift Amount < 32) 

rlwinm  r4,r4,32 - sh,sh,31
rlwimi  r4,r3,32 - sh,0,sh - 1
rlwinm  r3,r3,32 - sh,sh,31
rlwimi  r3,r2,32 - sh,0,sh - 1
srawi   r2,r2,sh

Shift Right Algebraic, n = 2 (Shift Amount < 64) 

subfic  r31,r6,32
srw     r3,r3,r6
slw     r0,r2,r31
or      r3,r3,r0
addic.  r31,r6,-32
sraw    r0,r2,r31
ble     $+8
ori     r3,r0,0
sraw    r2,r2,r6
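
The algebraic n = 2 sequence differs from the logical one in that the low word is replaced, rather than ORed, when the shift amount exceeds 32. The following Python sketch models it and can be compared with a signed 64-bit shift. The helper names are hypothetical, r6 is assumed to be in the range 0-63, and the conditional branch is modeled as a test of the signed result of addic.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def slw(rs, rb):
    n = rb & 0x3F
    return 0 if n >= 32 else (rs << n) & MASK32


def srw(rs, rb):
    n = rb & 0x3F
    return 0 if n >= 32 else rs >> n


def sraw(rs, rb):
    """Shift right algebraic word: amounts >= 32 yield 32 copies of the sign."""
    n = rb & 0x3F
    return (to_signed32(rs) >> min(n, 31)) & MASK32


def shift_right_algebraic_2(r2, r3, r6):
    """Model of the n = 2 algebraic sequence (r2 = high word, r3 = low word)."""
    r31 = (32 - r6) & MASK32     # subfic r31,r6,32
    r3 = srw(r3, r6)             # srw    r3,r3,r6
    r0 = slw(r2, r31)            # slw    r0,r2,r31
    r3 |= r0                     # or     r3,r3,r0
    r31 = (r6 - 32) & MASK32     # addic. r31,r6,-32 (sets CR0 from the result)
    r0 = sraw(r2, r31)           # sraw   r0,r2,r31
    if to_signed32(r31) > 0:     # ble $+8 skips the next instruction if r31 <= 0
        r3 = r0                  # ori    r3,r0,0
    r2 = sraw(r2, r6)            # sraw   r2,r2,r6
    return r2, r3
```

An OR cannot be used for the final merge here because sraw with a shift amount of 32 or more fills r0 with sign bits, which would corrupt the low word for small shift amounts; the branch keeps that value out of r3 unless the shift amount is greater than 32.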

Shift Right Algebraic, n = 3 (Shift Amount < 32) 

subfic  r31,r6,32
srw     r4,r4,r6
slw     r0,r3,r31
or      r4,r4,r0
srw     r3,r3,r6
slw     r0,r2,r31
or      r3,r3,r0
sraw    r2,r2,r6

Appendix D. Floating-Point Models 

This appendix describes the execution model for IEEE operations and gives examples of 
how the floating-point conversion instructions can be used to perform various conversions 
as well as providing models for floating-point instructions. 

D.1 Execution Model for IEEE Operations 

The following description uses double-precision arithmetic as an example; single-precision 
arithmetic is similar except that the fraction field is a 23-bit field and the single-precision 
guard, round, and sticky bits (described in this section) are logically adjacent to the 23-bit 
FRACTION field. 

IEEE-conforming significand arithmetic is performed with a floating-point accumulator 
where bits 0-55, shown in Figure D-1, comprise the significand of the intermediate result. 


[Figure: floating-point accumulator with fields S, C, L, FRACTION, G, R, X; bit positions 0, 1, 52, and 55 are marked] 


Figure D-1. IEEE 64-Bit Execution Model 

The bits and fields for the IEEE double-precision execution model are defined as follows: 

• The S bit is the sign bit. 

• The C bit is the carry bit that captures the carry out of the significand. 

• The L bit is the leading unit bit of the significand that receives the implicit bit from 
the operands. 

• The FRACTION is a 52-bit field that accepts the fraction of the operands. 

• The guard (G), round (R), and sticky (X) bits are extensions to the low-order bits of 
the accumulator. The G and R bits are required for postnormalization of the result. 
The G, R, and X bits are required during rounding to determine if the intermediate 
result is equally near the two nearest representable values. The X bit serves as an 
extension to the G and R bits by representing the logical OR of all bits that may 
appear to the low-order side of the R bit, due to either shifting the accumulator right 
or to other generation of low-order result bits. The G and R bits participate in the left 
shifts with zeros being shifted into the R bit. 

Table D-1 shows the significance of the G, R, and X bits with respect to the intermediate 
result (IR), the next lower in magnitude representable number (NL), and the next higher in 
magnitude representable number (NH). 


Table D-1. Interpretation of G, R, and X Bits 

G  R  X  Interpretation
0  0  0  IR is exact
0  0  1  IR closer to NL
0  1  0  IR closer to NL
0  1  1  IR closer to NL
1  0  0  IR midway between NL and NH
1  0  1  IR closer to NH
1  1  0  IR closer to NH
1  1  1  IR closer to NH


The significand of the intermediate result is made up of the L bit, the FRACTION, and the 
G, R, and X bits. 

The infinitely precise intermediate result of an operation is the result normalized in bits L, 
FRACTION, G, R, and X of the floating-point accumulator. 

After normalization, the intermediate result is rounded, using the rounding mode specified 
by FPSCR[RN]. If rounding causes a carry into C, the significand is shifted right one 
position and the exponent is incremented by one. This causes an inexact result and possibly 
exponent overflow. Fraction bits to the left of the bit position used for rounding are stored 
into the FPR, and low-order bit positions, if any, are set to zero. 

Four user-selectable rounding modes are provided through FPSCR[RN] as described in 
Section 3.3.5, “Rounding.” For rounding, the conceptual guard, round, and sticky bits are 
defined in terms of accumulator bits. 

Table D-2 shows the positions of the guard, round, and sticky bits for double-precision and 
single-precision floating-point numbers in the IEEE execution model. 

Table D-2. Location of the Guard, Round, and Sticky Bits: IEEE Execution Model 

Format  Guard  Round  Sticky
Double  G bit  R bit  X bit
Single  24     25     OR of bits 26-52, G, R, X


Rounding can be treated as though the significand were shifted right, if required, until the 
least-significant bit to be retained is in the low-order bit position of the FRACTION. If any 
of the guard, round, or sticky bits are nonzero, the result is inexact. 

Z1 and Z2, defined in Section 3.3.5, “Rounding,” can be used to approximate the result in 
the target format when one of the following rules is used: 

• Round to nearest 

- Guard bit = 0: The result is truncated (result exact (GRX = 000) or closest to 
next lower value in magnitude (GRX = 001, 010, or 011)). 

- Guard bit = 1: Depends on round and sticky bits: 

Case a: If the round or sticky bit is one (inclusive), the result is incremented 
(result closest to next higher value in magnitude (GRX = 101, 110, or 111)). 

Case b: If the round and sticky bits are zero (result midway between closest 
representable values) then if the low-order bit of the result is one, the result is 
incremented. Otherwise (the low-order bit of the result is zero) the result is 
truncated (this is the case of a tie rounded to even). 

If during the round-to-nearest process, truncation of the unrounded number 
produces the maximum magnitude for the specified precision, the following action 
is taken: 

- Guard bit = 1: Store infinity with the sign of the unrounded result. 

- Guard bit = 0: Store the truncated (maximum magnitude) value. 

• Round toward zero: Choose the smaller in magnitude of Z1 or Z2. If the guard, 
round, or sticky bit is nonzero, the result is inexact. 

• Round toward +infinity: Choose Z1. 

• Round toward -infinity: Choose Z2. 
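
The round-to-nearest rule above can be written compactly in terms of the G, R, and X bits. The Python sketch below is an illustration only: the function name is hypothetical, the significand is passed as an integer holding the bits to be retained, and a carry out of the significand (the C bit) and the maximum-magnitude case are left to the caller.

```python
def round_to_nearest(significand, g, r, x):
    """Apply round-to-nearest (ties to even) using guard, round, sticky bits.

    Returns (rounded_significand, inexact). The result may be one bit wider
    than the input if the increment carries out of the high-order bit.
    """
    inexact = bool(g or r or x)
    if g == 0:
        return significand, inexact        # GRX = 000, 001, 010, 011: truncate
    if r or x:
        return significand + 1, inexact    # GRX = 101, 110, 111: increment
    # GRX = 100: exactly midway, so round to the even neighbor.
    if significand & 1:
        return significand + 1, inexact
    return significand, inexact
```

For example, with the retained bits 0b1011 and GRX = 100 the tie is resolved upward to 0b1100, while 0b1010 with the same GRX is left unchanged; both results are inexact.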

Where the result is to have fewer than 53 bits of precision because the instruction is a 
floating round to single-precision or single-precision arithmetic instruction, the 
intermediate result either is normalized or is placed in correct denormalized form before 
being rounded. 

D.2 Execution Model for Multiply-Add Type Instructions 

The PowerPC architecture makes use of a special instruction form that performs up to three 
operations in one instruction (a multiply, an add, and a negate). With this added capability 
comes the special ability to produce a more exact intermediate result as an input to the 
rounder. Single-precision arithmetic is similar except that the fraction field is smaller. Note 
that the rounding occurs only after add; therefore, the computation of the sum and product 
together are infinitely precise before the final result is rounded to a representable format. 

The multiply-add significand arithmetic is considered to be performed with a floating-point 
accumulator, where bits 1-106 comprise the significand of the intermediate result. The 
format is shown in Figure D-2. 


[Figure: floating-point accumulator with fields S, C, L, FRACTION, X'; bit positions 0, 1, and 105 are marked] 



Figure D-2. Multiply-Add 64-Bit Execution Model 

The first part of the operation is a multiply. The multiply has two 53-bit significands as 
inputs, which are assumed to be prenormalized, and produces a result conforming to the 
above model. If there is a carry out of the significand (into the C bit), the significand is 
shifted right one position, placing the L bit into the most-significant bit of the FRACTION 
and placing the C bit into the L bit. All 106 bits (L bit plus the fraction) of the product take 
part in the add operation. If the exponents of the two inputs to the adder are not equal, the 
significand of the operand with the smaller exponent is aligned (shifted) to the right by an 
amount added to that exponent to make it equal to the other input’s exponent. Zeros are 
shifted into the left of the significand as it is aligned and bits shifted out of bit 105 of the 
significand are ORed into the X' bit. The add operation also produces a result conforming 
to the above model with the X' bit taking part in the add operation. 

The result of the add is then normalized, with all bits of the add result, except the X' bit, 
participating in the shift. The normalized result serves as the intermediate result that is input 
to the rounder. 

For rounding, the conceptual guard, round, and sticky bits are defined in terms of 
accumulator bits. Table D-3 shows the positions of the guard, round, and sticky bits for 
double-precision and single-precision floating-point numbers in the multiply-add execution 
model. 

Table D-3. Location of the Guard, Round, and Sticky Bits: Multiply-Add Execution Model 

Format  Guard  Round  Sticky
Double  53     54     OR of 55-105, X'
Single  24     25     OR of 26-105, X'

The rules for rounding the intermediate result are the same as those given in Section D.1, 
“Execution Model for IEEE Operations.” 

If the instruction is floating negative multiply-add or floating negative multiply-subtract, 
the final result is negated. 

Floating-point multiply-add instructions combine a multiply and an add operation without 
an intermediate rounding operation. The fraction part of the intermediate product is 106 bits 
wide, and all 106 bits take part in the add/subtract portion of the instruction. 
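
The effect of omitting the intermediate rounding can be seen with a small numerical experiment. The Python sketch below is an illustration under stated assumptions: exact rational arithmetic (fractions.Fraction) stands in for the infinitely precise intermediate result, the function names are hypothetical, and only double-precision operands are considered.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c with a single rounding at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)            # one rounding to double precision


def separate_multiply_add(a, b, c):
    """Compute a*b + c as two operations, each rounded to double precision."""
    return a * b + c


# Operands chosen so that the exact product, 1 - 2**-60, needs more
# significand bits than a double can hold.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

fused = fused_multiply_add(a, b, c)
separate = separate_multiply_add(a, b, c)
```

In the two-step version the product rounds to 1.0 and the sum becomes zero, whereas the single-rounding version retains the low-order part of the product and returns -2^-60.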

Status bits are set as follows: 

• Overflow, underflow, and inexact exception bits, the FR and FI bits, and the FPRF 
field are set based on the final result of the operation, and not on the result of the 
multiplication. 

• Invalid operation exception bits are set as if the multiplication and the addition were 
performed using two separate instructions (for example, an fmul instruction 
followed by an fadd instruction). That is, multiplication of infinity by 0 or of 
anything by an SNaN, causes the corresponding exception bits to be set. 

D.3 Floating-Point Conversions 

This section provides examples of floating-point conversion instructions. Note that some of 
the examples use the optional Floating Select (fsel) instruction. Care must be taken in using 
fsel if IEEE compatibility is required, or if the values being tested can be NaNs or infinities. 

D.3.1 Conversion from Floating-Point Number to Signed Fixed-Point Integer Word 

The full convert to signed fixed-point integer word function can be implemented with the 
following sequence, assuming that the floating-point value to be converted is in FPR1, the 
result is returned in GPR3, and a double word at displacement (disp) from the address in 
GPR1 can be used as scratch space. 

fctiw[z]  f2,f1              #convert to fx int
stfd      f2,disp(r1)        #store float
lwz       r3,disp + 4(r1)    #load word and zero

D.3.2 Conversion from Floating-Point Number to Unsigned Fixed-Point Integer Word 

In a 32-bit implementation, the full convert to unsigned fixed-point integer word function 
can be implemented with the sequence shown below, assuming that the floating-point value 
to be converted is in FPR1, the value zero is in FPR0, the value 2^32 - 1 is in FPR3, the value 
2^31 is in FPR4, the result is returned in GPR3, and a double word at displacement (disp) 
from the address in GPR1 can be used as scratch space. 


fsel      f2,f1,f1,f0        #use 0 if < 0
fsub      f5,f3,f1           #use max if > max
fsel      f2,f5,f2,f3
fsub      f5,f2,f4           #subtract 2**31
fcmpu     cr2,f2,f4          #use diff if >= 2**31
fsel      f2,f5,f5,f2
fctiw[z]  f2,f2              #convert to fx int
stfd      f2,disp(r1)        #store float
lwz       r3,disp + 4(r1)    #load word
blt       cr2,$+8            #add 2**31 if input
xoris     r3,r3,0x8000       #was >= 2**31
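
The clamping and offset steps of this sequence can be followed in a short Python model. The sketch is illustrative only: the helper names are hypothetical, Python floats stand in for the FPRs, only the round-toward-zero form (fctiwz) is modeled, and NaN inputs are not exercised.

```python
def fsel(fra, frc, frb):
    """Floating select: frc if fra >= 0.0, otherwise frb (also for NaN)."""
    return frc if fra >= 0.0 else frb


def fctiwz(x):
    """Convert to a signed 32-bit integer, rounding toward zero and saturating."""
    n = int(x)                                 # truncates toward zero
    n = max(-2 ** 31, min(2 ** 31 - 1, n))
    return n & 0xFFFFFFFF                      # low-order word read back by lwz


def to_unsigned_word(f1):
    f0 = 0.0                      # FPR0: zero
    f3 = float(2 ** 32 - 1)       # FPR3: maximum unsigned word value
    f4 = float(2 ** 31)           # FPR4: 2**31
    f2 = fsel(f1, f1, f0)         # use 0 if < 0
    f5 = f3 - f1
    f2 = fsel(f5, f2, f3)         # use max if > max
    f5 = f2 - f4                  # subtract 2**31
    less_than = f2 < f4           # fcmpu cr2,f2,f4
    f2 = fsel(f5, f5, f2)         # use diff if >= 2**31
    r3 = fctiwz(f2)               # convert, store, and reload the low word
    if not less_than:             # blt cr2,$+8 skips the xoris
        r3 ^= 0x80000000          # add 2**31 back
    return r3
```

Values of 2^31 or more are first reduced by 2^31 so that they fit the signed conversion, and the xoris then restores the high-order bit; values below 2^31 pass through the signed conversion unchanged.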


D.4 Floating-Point Models 

This section describes models for floating-point instructions. 


D.4.1 Floating-Point Round to Single-Precision Model 

The following algorithm describes the operation of the Floating Round to Single-Precision 
(frsp) instruction. 

If frB[1-11] < 897 and frB[1-63] > 0 then 
  Do 
    If FPSCR[UE] = 0 then goto Disabled Exponent Underflow 
    If FPSCR[UE] = 1 then goto Enabled Exponent Underflow 
  End 

If frB[1-11] > 1150 and frB[1-11] < 2047 then 
  Do 
    If FPSCR[OE] = 0 then goto Disabled Exponent Overflow 
    If FPSCR[OE] = 1 then goto Enabled Exponent Overflow 
  End 

If frB[1-11] > 896 and frB[1-11] < 1151 then goto Normal Operand 

If frB[1-63] = 0 then goto Zero Operand 

If frB[1-11] = 2047 then 
  Do 
    If frB[12-63] = 0 then goto Infinity Operand 
    If frB[12] = 1 then goto QNaN Operand 
    If frB[12] = 0 and frB[13-63] > 0 then goto SNaN Operand 
  End 
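
The operand classification at the start of the algorithm can be sketched in Python by extracting the exponent and fraction fields of a double-precision value. This is an illustration, not part of the model: the function name and the ue/oe parameters (standing in for FPSCR[UE] and FPSCR[OE]) are hypothetical, and the returned strings are simply the labels used in the algorithm.

```python
import struct


def frsp_dispatch(x, ue=0, oe=0):
    """Return the label that the frsp algorithm branches to for operand x."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exp = (bits >> 52) & 0x7FF            # frB[1-11], biased exponent
    frac = bits & ((1 << 52) - 1)         # frB[12-63]
    magnitude = bits & ((1 << 63) - 1)    # frB[1-63]

    if exp < 897 and magnitude > 0:
        return "Enabled Exponent Underflow" if ue else "Disabled Exponent Underflow"
    if 1150 < exp < 2047:
        return "Enabled Exponent Overflow" if oe else "Disabled Exponent Overflow"
    if 896 < exp < 1151:
        return "Normal Operand"
    if magnitude == 0:
        return "Zero Operand"
    # Only exp = 2047 remains: infinity or NaN.
    if frac == 0:
        return "Infinity Operand"
    if (frac >> 51) & 1:                  # frB[12], high-order fraction bit
        return "QNaN Operand"
    return "SNaN Operand"
```

The bounds 897 and 1150 are the double-format biased exponents of 2^-126 and 2^127, the smallest and largest exponents of a normalized single-precision number.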

Disabled Exponent Underflow:
  sign ← frB[0]
  If frB[1-11] = 0 then
    Do
      exp ← -1022
      frac[0-52] ← 0b0 || frB[12-63]
    End
  If frB[1-11] > 0 then
    Do
      exp ← frB[1-11] - 1023
      frac[0-52] ← 0b1 || frB[12-63]
    End
  Denormalize operand:
    G || R || X ← 0b000
    Do while exp < -126
      exp ← exp + 1
      frac[0-52] || G || R || X ← 0b0 || frac || G || (R | X)
    End
  FPSCR[UX] ← frac[24-52] || G || R || X > 0
  Round single(sign,exp,frac[0-52],G,R,X)
  FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
  If frac[0-52] = 0 then
    Do
      frD[0] ← sign
      frD[1-63] ← 0
      If sign = 0 then FPSCR[FPRF] ← “+zero”
      If sign = 1 then FPSCR[FPRF] ← “-zero”
    End
  If frac[0-52] > 0 then
    Do
      If frac[0] = 1 then
        Do
          If sign = 0 then FPSCR[FPRF] ← “+normal number”
          If sign = 1 then FPSCR[FPRF] ← “-normal number”
        End
      If frac[0] = 0 then
        Do
          If sign = 0 then FPSCR[FPRF] ← “+denormalized number”
          If sign = 1 then FPSCR[FPRF] ← “-denormalized number”
        End
      Normalize operand:
        Do while frac[0] = 0
          exp ← exp - 1
          frac[0-52] ← frac[1-52] || 0b0
        End
      frD[0] ← sign
      frD[1-11] ← exp + 1023
      frD[12-63] ← frac[1-52]
    End
  Done

Enabled Exponent Underflow
  FPSCR[UX] ← 1
  sign ← frB[0]
  If frB[1-11] = 0 then
    Do
      exp ← -1022
      frac[0-52] ← 0b0 || frB[12-63]
    End
  If frB[1-11] > 0 then
    Do
      exp ← frB[1-11] - 1023
      frac[0-52] ← 0b1 || frB[12-63]
    End
  Normalize operand:
    Do while frac[0] = 0
      exp ← exp - 1
      frac[0-52] ← frac[1-52] || 0b0
    End
  Round single(sign,exp,frac[0-52],0,0,0)
  FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
  exp ← exp + 192
  frD[0] ← sign
  frD[1-11] ← exp + 1023
  frD[12-63] ← frac[1-52]
  If sign = 0 then FPSCR[FPRF] ← “+normal number”
  If sign = 1 then FPSCR[FPRF] ← “-normal number”
  Done

Disabled Exponent Overflow
  FPSCR[OX] ← 1
  If FPSCR[RN] = 0b00 then /* Round to Nearest */
    Do
      If frB[0] = 0 then frD ← 0x7FF0_0000_0000_0000
      If frB[0] = 1 then frD ← 0xFFF0_0000_0000_0000
      If frB[0] = 0 then FPSCR[FPRF] ← “+infinity”
      If frB[0] = 1 then FPSCR[FPRF] ← “-infinity”
    End
  If FPSCR[RN] = 0b01 then /* Round Truncate */
    Do
      If frB[0] = 0 then frD ← 0x47EF_FFFF_E000_0000
      If frB[0] = 1 then frD ← 0xC7EF_FFFF_E000_0000
      If frB[0] = 0 then FPSCR[FPRF] ← “+normal number”
      If frB[0] = 1 then FPSCR[FPRF] ← “-normal number”
    End
  If FPSCR[RN] = 0b10 then /* Round to +Infinity */
    Do
      If frB[0] = 0 then frD ← 0x7FF0_0000_0000_0000
      If frB[0] = 1 then frD ← 0xC7EF_FFFF_E000_0000
      If frB[0] = 0 then FPSCR[FPRF] ← “+infinity”
      If frB[0] = 1 then FPSCR[FPRF] ← “-normal number”
    End
  If FPSCR[RN] = 0b11 then /* Round to -Infinity */
    Do
      If frB[0] = 0 then frD ← 0x47EF_FFFF_E000_0000
      If frB[0] = 1 then frD ← 0xFFF0_0000_0000_0000
      If frB[0] = 0 then FPSCR[FPRF] ← “+normal number”
      If frB[0] = 1 then FPSCR[FPRF] ← “-infinity”
    End
  FPSCR[FR] ← undefined
  FPSCR[FI] ← 1
  FPSCR[XX] ← 1
  Done

Enabled Exponent Overflow
  sign ← frB[0]
  exp ← frB[1-11] - 1023
  frac[0-52] ← 0b1 || frB[12-63]
  Round single(sign,exp,frac[0-52],0,0,0)
  FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]

Enabled Overflow
  FPSCR[OX] ← 1
  exp ← exp - 192
  frD[0] ← sign
  frD[1-11] ← exp + 1023
  frD[12-63] ← frac[1-52]
  If sign = 0 then FPSCR[FPRF] ← “+normal number”
  If sign = 1 then FPSCR[FPRF] ← “-normal number”
  Done

Zero Operand
  frD ← frB
  If frB[0] = 0 then FPSCR[FPRF] ← “+zero”
  If frB[0] = 1 then FPSCR[FPRF] ← “-zero”
  FPSCR[FR FI] ← 0b00
  Done

Infinity Operand
  frD ← frB
  If frB[0] = 0 then FPSCR[FPRF] ← “+infinity”
  If frB[0] = 1 then FPSCR[FPRF] ← “-infinity”
  Done

QNaN Operand:
  frD ← frB[0-34] || 0b0_0000_0000_0000_0000_0000_0000_0000
  FPSCR[FPRF] ← “QNaN”
  FPSCR[FR FI] ← 0b00
  Done

SNaN Operand
  FPSCR[VXSNAN] ← 1
  If FPSCR[VE] = 0 then
    Do
      frD[0-11] ← frB[0-11]
      frD[12] ← 1
      frD[13-63] ← frB[13-34] || 0b0_0000_0000_0000_0000_0000_0000_0000
      FPSCR[FPRF] ← “QNaN”
    End
  FPSCR[FR FI] ← 0b00
  Done

Normal Operand
  sign ← frB[0]
  exp ← frB[1-11] - 1023
  frac[0-52] ← 0b1 || frB[12-63]
  Round single(sign,exp,frac[0-52],0,0,0)
  FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
  If exp > +127 and FPSCR[OE] = 0 then go to Disabled Exponent Overflow
  If exp > +127 and FPSCR[OE] = 1 then go to Enabled Overflow
  frD[0] ← sign
  frD[1-11] ← exp + 1023
  frD[12-63] ← frac[1-52]
  If sign = 0 then FPSCR[FPRF] ← “+normal number”
  If sign = 1 then FPSCR[FPRF] ← “-normal number”
  Done

Round Single(sign,exp,frac[0-52],G,R,X)
  inc ← 0
  lsb ← frac[23]
  gbit ← frac[24]
  rbit ← frac[25]
  xbit ← (frac[26-52] || G || R || X) ≠ 0
  If FPSCR[RN] = 0b00 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu011u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc ← 1
    End
  If FPSCR[RN] = 0b10 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1
    End
  If FPSCR[RN] = 0b11 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1
    End
  frac[0-23] ← frac[0-23] + inc
  If carry_out = 1 then
    Do
      frac[0-23] ← 0b1 || frac[0-22]
      exp ← exp + 1
    End
  frac[24-52] ← (29)0
  FPSCR[FR] ← inc
  FPSCR[FI] ← gbit | rbit | xbit
  Return
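As an illustration of the Round Single step, the following Python sketch applies the round-to-nearest case (FPSCR[RN] = 0b00) to the bit image of a double. It is a simplified model under stated assumptions: the operand is assumed to lie in the single-precision normal range (the Normal Operand path, with no overflow after rounding), FPSCR updates are omitted, and the function name frsp_nearest is hypothetical.

```python
import struct

def frsp_nearest(x):
    """Round a double in single's normal range to single precision (RN = 0b00)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exp = ((bits >> 52) & 0x7FF) - 1023
    frac = (1 << 52) | (bits & ((1 << 52) - 1))   # frac[0-52] = 0b1 || frB[12-63]
    lsb = (frac >> 29) & 1                        # frac[23]
    gbit = (frac >> 28) & 1                       # frac[24]
    rbit = (frac >> 27) & 1                       # frac[25]
    xbit = int((frac & ((1 << 27) - 1)) != 0)     # frac[26-52] nonzero
    # 0bu11uu, 0bu011u, 0bu01u1: guard set and (lsb, round, or sticky) set
    inc = int(gbit and (lsb or rbit or xbit))
    top = (frac >> 29) + inc                      # frac[0-23] + inc
    if top >> 24:                                 # carry out of frac[0]
        top >>= 1                                 # 0b1 || frac[0-22]
        exp += 1
    out = (sign << 63) | ((exp + 1023) << 52) | ((top & 0x7FFFFF) << 29)
    return struct.unpack(">d", struct.pack(">Q", out))[0]
```

For operands in this range the result should agree with what a host conversion to single precision produces, which gives a convenient cross-check of the three bit patterns.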

D.4.2 Floating-Point Convert to Integer Model 

The following algorithm describes the operation of the floating-point convert to integer 
instructions. In this example, ‘u’ represents an undefined hexadecimal digit. 

If Floating Convert to Integer Word
  Then Do
    round_mode ← FPSCR[RN]
    tgt_precision ← “32-bit integer”
  End

If Floating Convert to Integer Word with round toward Zero
  Then Do
    round_mode ← 0b01
    tgt_precision ← “32-bit integer”
  End

If Floating Convert to Integer Double Word
  Then Do
    round_mode ← FPSCR[RN]
    tgt_precision ← “64-bit integer”
  End

If Floating Convert to Integer Double Word with Round toward Zero
  Then Do
    round_mode ← 0b01
    tgt_precision ← “64-bit integer”
  End

sign ← frB[0]

If frB[1-11] = 2047 and frB[12-63] = 0 then goto Infinity Operand
If frB[1-11] = 2047 and frB[12] = 0 then goto SNaN Operand
If frB[1-11] = 2047 and frB[12] = 1 then goto QNaN Operand
If frB[1-11] > 1054 then goto Large Operand

If frB[1-11] > 0 then exp ← frB[1-11] - 1023 /* exp - bias */
If frB[1-11] = 0 then exp ← -1022
If frB[1-11] > 0 then frac[0-64] ← 0b01 || frB[12-63] || (11)0 /*normal*/
If frB[1-11] = 0 then frac[0-64] ← 0b00 || frB[12-63] || (11)0 /*denormal*/

gbit || rbit || xbit ← 0b000

Do i = 1,63 - exp /*do the loop 0 times if exp = 63*/
  frac[0-64] || gbit || rbit || xbit ← 0b0 || frac[0-64] || gbit || (rbit | xbit)
End

Round Integer(sign,frac[0-64],gbit,rbit,xbit,round_mode)

If sign = 1 then frac[0-64] ← ¬frac[0-64] + 1 /* needed leading 0 for -2**64 < frB < -2**63 */

If tgt_precision = “32-bit integer” and frac[0-64] > +2**31 - 1 then goto Large Operand
If tgt_precision = “64-bit integer” and frac[0-64] > +2**63 - 1 then goto Large Operand
If tgt_precision = “32-bit integer” and frac[0-64] < -2**31 then goto Large Operand
If tgt_precision = “64-bit integer” and frac[0-64] < -2**63 then goto Large Operand

FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]

If tgt_precision = “32-bit integer” then frD ← 0xuuuu_uuuu || frac[33-64]
If tgt_precision = “64-bit integer” then frD ← frac[1-64]

FPSCR[FPRF] ← undefined
Done

Round Integer(sign,frac[0-64],gbit,rbit,xbit,round_mode)

In this example, ‘u’ represents an undefined hexadecimal digit. Comparisons ignore the u
bits.

  inc ← 0
  If round_mode = 0b00 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0bu11uu then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0bu011u then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0bu01u1 then inc ← 1
    End
  If round_mode = 0b10 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0b0u1uu then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0b0uu1u then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0b0uuu1 then inc ← 1
    End
  If round_mode = 0b11 then
    Do
      If sign || frac[64] || gbit || rbit || xbit = 0b1u1uu then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0b1uu1u then inc ← 1
      If sign || frac[64] || gbit || rbit || xbit = 0b1uuu1 then inc ← 1
    End
  frac[0-64] ← frac[0-64] + inc
  FPSCR[FR] ← inc
  FPSCR[FI] ← gbit | rbit | xbit
  Return

Infinity Operand
  FPSCR[FR FI VXCVI] ← 0b001
  If FPSCR[VE] = 0 then Do
    If tgt_precision = “32-bit integer” then
      Do
        If sign = 0 then frD ← 0xuuuu_uuuu_7FFF_FFFF
        If sign = 1 then frD ← 0xuuuu_uuuu_8000_0000
      End
    Else
      Do
        If sign = 0 then frD ← 0x7FFF_FFFF_FFFF_FFFF
        If sign = 1 then frD ← 0x8000_0000_0000_0000
      End
    FPSCR[FPRF] ← undefined
  End
  Done

SNaN Operand
  FPSCR[FR FI VXCVI VXSNAN] ← 0b0011
  If FPSCR[VE] = 0 then
    Do
      If tgt_precision = “32-bit integer”
        then frD ← 0xuuuu_uuuu_8000_0000
      If tgt_precision = “64-bit integer”
        then frD ← 0x8000_0000_0000_0000
      FPSCR[FPRF] ← undefined
    End
  Done

QNaN Operand
  FPSCR[FR FI VXCVI] ← 0b001
  If FPSCR[VE] = 0 then
    Do
      If tgt_precision = “32-bit integer” then frD ← 0xuuuu_uuuu_8000_0000
      If tgt_precision = “64-bit integer” then frD ← 0x8000_0000_0000_0000
      FPSCR[FPRF] ← undefined
    End
  Done

Large Operand
  FPSCR[FR FI VXCVI] ← 0b001
  If FPSCR[VE] = 0 then Do
    If tgt_precision = “32-bit integer” then
      Do
        If sign = 0 then frD ← 0xuuuu_uuuu_7FFF_FFFF
        If sign = 1 then frD ← 0xuuuu_uuuu_8000_0000
      End
    Else
      Do
        If sign = 0 then frD ← 0x7FFF_FFFF_FFFF_FFFF
        If sign = 1 then frD ← 0x8000_0000_0000_0000
      End
    FPSCR[FPRF] ← undefined
  End
  Done
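The observable result of the 32-bit conversions can be summarized with a short Python sketch. It assumes invalid operation exceptions are disabled (FPSCR[VE] = 0), models only round to nearest and round toward zero, ignores all FPSCR updates, and returns just the low word of frD as a signed integer. The function name fctiw and its round_mode strings are illustrative only.

```python
import math

def fctiw(x, round_mode="nearest"):
    """Low word of frD after fctiw (nearest) or fctiwz ("zero"), as a signed int."""
    if math.isnan(x):
        return -2**31                       # SNaN/QNaN Operand: 0x8000_0000
    if math.isinf(x):
        return 2**31 - 1 if x > 0 else -2**31   # Infinity Operand
    # Python's round() rounds halfway cases to even, like RN = 0b00
    n = round(x) if round_mode == "nearest" else math.trunc(x)
    return max(-2**31, min(2**31 - 1, n))   # Large Operand saturates
```

Note that out-of-range, infinite, and NaN inputs all produce a defined saturated value rather than wrapping.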

D.4.3 Floating-Point Convert from Integer Model 

The following describes, algorithmically, the operation of the floating-point convert from 
integer instructions. 

sign ← frB[0]
exp ← 63
frac[0-63] ← frB

If frac[0-63] = 0 then go to Zero Operand

If sign = 1 then frac[0-63] ← ¬frac[0-63] + 1

Do while frac[0] = 0
  frac[0-63] ← frac[1-63] || ‘0’
  exp ← exp - 1
End

Round Float(sign,exp,frac[0-63],FPSCR[RN])

If sign = 1 then FPSCR[FPRF] ← “-normal number”
If sign = 0 then FPSCR[FPRF] ← “+normal number”
frD[0] ← sign
frD[1-11] ← exp + 1023
frD[12-63] ← frac[1-52]
Done

Zero Operand
  FPSCR[FR FI] ← 0b00
  FPSCR[FPRF] ← “+zero”
  frD ← 0x0000_0000_0000_0000
  Done

Round Float(sign,exp,frac[0-63],round_mode)

In this example ‘u’ represents an undefined hexadecimal digit. Comparisons ignore the u
bits.

  inc ← 0
  lsb ← frac[52]
  gbit ← frac[53]
  rbit ← frac[54]
  xbit ← frac[55-63] > 0
  If round_mode = 0b00 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu011u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc ← 1
    End
  If round_mode = 0b10 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1
    End
  If round_mode = 0b11 then
    Do
      If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc ← 1
      If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1
    End
  frac[0-52] ← frac[0-52] + inc
  If carry_out = 1 then exp ← exp + 1
  FPSCR[FR] ← inc
  FPSCR[FI] ← gbit | rbit | xbit
  FPSCR[XX] ← FPSCR[XX] | FPSCR[FI]
  Return
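The normalize-then-round structure of this model can be followed in Python for the round-to-nearest case. This is a sketch under assumptions: the input is a Python integer in the signed 64-bit range, only round_mode = 0b00 is modelled, FPSCR updates are left out, and fcfid_nearest is a hypothetical name.

```python
import struct

def fcfid_nearest(n):
    """Convert a signed 64-bit integer to double, rounding to nearest (RN = 0b00)."""
    if n == 0:
        return 0.0                                  # Zero Operand
    sign = 1 if n < 0 else 0
    frac = abs(n)                                   # frac[0-63] after negation
    exp = 63
    while not (frac >> 63) & 1:                     # Do while frac[0] = 0
        frac = (frac << 1) & 0xFFFFFFFFFFFFFFFF
        exp -= 1
    lsb = (frac >> 11) & 1                          # frac[52]
    gbit = (frac >> 10) & 1                         # frac[53]
    rbit = (frac >> 9) & 1                          # frac[54]
    xbit = int((frac & 0x1FF) != 0)                 # frac[55-63] > 0
    inc = int(gbit and (lsb or rbit or xbit))       # the three RN = 0b00 patterns
    top = (frac >> 11) + inc                        # frac[0-52] + inc
    if top >> 53:                                   # carry out
        top >>= 1
        exp += 1
    bits = (sign << 63) | ((exp + 1023) << 52) | (top & ((1 << 52) - 1))
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

Integers up to 2**53 in magnitude convert exactly; beyond that, the guard, round, and sticky bits decide the rounding, as in the 2**53 + 1 halfway case.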


D.5 Floating-Point Selection 

The following are examples of how the optional fsel instruction can be used to implement 
floating-point minimum and maximum functions, and certain simple forms of if-then-else 
constructions, without branching. 

The examples show program fragments in an imaginary, C-like, high-level programming 
language, and the corresponding program fragment using fsel and other PowerPC 
instructions. In the examples, a , b , x, y, and z are floating-point variables, which are 
assumed to be in FPRs fa, fb, fx, fy, and fz. FPR fs is assumed to be available for scratch 
space. 

Additional examples can be found in Section D.3, “Floating-Point Conversions.” 

Note that care must be taken in using fsel if IEEE compatibility is required, or if the values 
being tested can be NaNs or infinities; see Section D.5.4, “Notes.” 
D.5.1 Comparison to Zero

This section provides examples in a program fragment code sequence for the comparison
to zero case.

High-level language:          PowerPC:

if a ≥ 0.0 then x ← y         fsel  fx,fa,fy,fz   (see Section D.5.4, “Notes” number 1)
else x ← z

if a > 0.0 then x ← y         fneg  fs,fa
else x ← z                    fsel  fx,fs,fz,fy   (see Section D.5.4, “Notes” numbers 1 and 2)

if a = 0.0 then x ← y         fsel  fx,fa,fy,fz
else x ← z                    fneg  fs,fa
                              fsel  fx,fs,fx,fz   (see Section D.5.4, “Notes” number 1)


D.5.2 Minimum and Maximum 

This section provides examples in a program fragment code sequence for the minimum and 
maximum cases. 


High-level language:          PowerPC:

x ← min(a, b)                 fsub  fs,fa,fb      (see Section D.5.4, “Notes” numbers 3, 4, and 5)
                              fsel  fx,fs,fb,fa

x ← max(a, b)                 fsub  fs,fa,fb      (see Section D.5.4, “Notes” numbers 3, 4, and 5)
                              fsel  fx,fs,fa,fb


D.5.3 Simple If-Then-Else Constructions 

This section provides examples in a program fragment code sequence for simple if-then- 
else statements. 

High-level language:          PowerPC:

if a ≥ b then x ← y           fsub  fs,fa,fb
else x ← z                    fsel  fx,fs,fy,fz   (see Section D.5.4, “Notes” numbers 4 and 5)

if a > b then x ← y           fsub  fs,fb,fa
else x ← z                    fsel  fx,fs,fz,fy   (see Section D.5.4, “Notes” numbers 3, 4, and 5)

if a = b then x ← y           fsub  fs,fa,fb
else x ← z                    fsel  fx,fs,fy,fz
                              fneg  fs,fs
                              fsel  fx,fs,fx,fz   (see Section D.5.4, “Notes” numbers 4 and 5)

D.5.4 Notes

The following notes apply to the examples found in Section D.5.1, “Comparison to Zero,”
Section D.5.2, “Minimum and Maximum,” and Section D.5.3, “Simple If-Then-Else
Constructions,” and to the corresponding cases using the other three arithmetic relations (<,
≤, and ≠). These notes should also be considered when any other use of fsel is contemplated.

In these notes the “optimized program” is the PowerPC program shown, and the
“unoptimized program” (not shown) is the corresponding PowerPC program that uses
fcmpu and branch conditional instructions instead of fsel.

1. The unoptimized program affects the VXSNAN bit of the FPSCR, and therefore
may cause the system error handler to be invoked if the corresponding exception is
enabled, while the optimized program does not affect this bit. This property of the
optimized program is incompatible with the IEEE standard. (Note that the
architecture specification also refers to exceptions as interrupts.)

2. The optimized program gives the incorrect result if ‘a’ is a NaN.

3. The optimized program gives the incorrect result if ‘a’ and/or ‘b’ is a NaN (except
that it may give the correct result in some cases for the minimum and maximum
functions, depending on how those functions are defined to operate on NaNs).

4. The optimized program gives the incorrect result if ‘a’ and ‘b’ are infinities of the
same sign. (Here it is assumed that invalid operation exceptions are disabled, in
which case the result of the subtraction is a NaN. The analysis is more complicated
if invalid operation exceptions are enabled, because in that case the target register of
the subtraction is unchanged.)

5. The optimized program affects the OX, UX, XX, and VXISI bits of the FPSCR, and
therefore may cause the system error handler to be invoked if the corresponding
exceptions are enabled, while the unoptimized program does not affect these bits.
This property of the optimized program is incompatible with the IEEE standard.
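The NaN and infinity cases in notes 3 and 4 are easy to reproduce with a small Python model of fsel. The sketch below assumes invalid operation exceptions are disabled, so that subtracting equal infinities yields a NaN; the helper names fsel, fsel_min, fsel_max, and select_ge are illustrative and take operands in assembler order.

```python
import math

def fsel(a, b, c):
    """fsel in assembler operand order: b if a >= 0.0, else c (a NaN selects c)."""
    return b if a >= 0.0 else c

def fsel_min(a, b):
    fs = a - b                   # fsub fs,fa,fb
    return fsel(fs, b, a)        # fsel fx,fs,fb,fa

def fsel_max(a, b):
    fs = a - b                   # fsub fs,fa,fb
    return fsel(fs, a, b)        # fsel fx,fs,fa,fb

def select_ge(a, b, y, z):
    """Optimized form of: if a >= b then x <- y else x <- z."""
    return fsel(a - b, y, z)
```

With ordinary operands these match min, max, and the comparison. With a NaN operand the result of fsel_min depends on which operand is the NaN, and select_ge applied to two positive infinities takes the else branch even though the comparison is true.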

D.6 Floating-Point Load Instructions 

There are two basic forms of load instruction — single-precision and double-precision. 
Because the FPRs support only floating-point double format, single-precision load floating- 
point instructions convert single-precision data to double-precision format prior to loading 
the operands into the target FPR. The conversion and loading steps follow: 

Let WORD[0-31] be the floating point single-precision operand accessed from memory. 

Normalized Operand
  If WORD[1-8] > 0 and WORD[1-8] < 255
    frD[0-1] ← WORD[0-1]
    frD[2] ← ¬WORD[1]
    frD[3] ← ¬WORD[1]
    frD[4] ← ¬WORD[1]
    frD[5-63] ← WORD[2-31] || (29)0

Denormalized Operand
  If WORD[1-8] = 0 and WORD[9-31] ≠ 0
    sign ← WORD[0]
    exp ← -126
    frac[0-52] ← 0b0 || WORD[9-31] || (29)0
    normalize the operand
      Do while frac[0] = 0
        frac ← frac[1-52] || 0b0
        exp ← exp - 1
      End
    frD[0] ← sign
    frD[1-11] ← exp + 1023
    frD[12-63] ← frac[1-52]

Infinity / QNaN / SNaN / Zero
  If WORD[1-8] = 255 or WORD[1-31] = 0
    frD[0-1] ← WORD[0-1]
    frD[2] ← WORD[1]
    frD[3] ← WORD[1]
    frD[4] ← WORD[1]
    frD[5-63] ← WORD[2-31] || (29)0

For double-precision floating-point load instructions, no conversion is required as the data 
from memory is copied directly into the FPRs. 
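The single-precision load conversion above maps directly onto integer bit operations. The following Python sketch works on raw bit images (a 32-bit WORD in, a 64-bit frD image out); the function name lfs_convert is hypothetical and the code is a model of the three cases, not a description of any implementation.

```python
import struct

def lfs_convert(word):
    """Expand a 32-bit single-precision image to a 64-bit double image."""
    sign = word >> 31                  # WORD[0]
    e = (word >> 23) & 0xFF            # WORD[1-8]
    m = word & 0x7FFFFF                # WORD[9-31]
    top = (word >> 30) & 1             # WORD[1]
    if 0 < e < 255:                    # Normalized Operand
        inv = top ^ 1                  # frD[2-4] <- not WORD[1]
        hi = (sign << 4) | (top << 3) | (inv << 2) | (inv << 1) | inv
        return (hi << 59) | ((word & 0x3FFFFFFF) << 29)
    if e == 0 and m != 0:              # Denormalized Operand
        exp = -126
        frac = m << 29                 # 0b0 || WORD[9-31] || (29)0
        while not (frac >> 52) & 1:    # normalize until frac[0] = 1
            frac = (frac << 1) & ((1 << 53) - 1)
            exp -= 1
        return (sign << 63) | ((exp + 1023) << 52) | (frac & ((1 << 52) - 1))
    hi = (sign << 4) | (top * 0b1111)  # Infinity / NaN / Zero: frD[2-4] <- WORD[1]
    return (hi << 59) | ((word & 0x3FFFFFFF) << 29)
```

The exponent widening works by replicating (or inverting) the top exponent bit, which is why no arithmetic on the exponent is needed for normalized values.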

Many floating-point load instructions have an update form in which register rA is updated
with the EA. For these forms, if operand rA ≠ 0, the effective address (EA) is placed into
register rA and the memory element (word or double word) addressed by the EA is loaded
into the floating-point register specified by operand frD; if operand rA = 0, the instruction
form is invalid.

Recall that rA, rB, and rD denote GPRs, while frA, frB, frC, frS, and frD denote FPRs.

D.7 Floating-Point Store Instructions

There are three basic forms of store instruction — single-precision, double-precision, and 
integer. The integer form is provided by the optional stfiwx instruction. Because the FPRs 
support only floating-point double format for floating-point data, single-precision store 
floating-point instructions convert double-precision data to single-precision format prior to 
storing the operands into memory. The conversion steps follow: 

Let WORD [0-31] be the word written to in memory. 

No Denormalization Required (includes Zero/Infinity/NaN)
  if frS[1-11] > 896 or frS[1-63] = 0 then
    WORD[0-1] ← frS[0-1]
    WORD[2-31] ← frS[5-34]

Denormalization Required
  if 874 ≤ frS[1-11] ≤ 896 then
    sign ← frS[0]
    exp ← frS[1-11] - 1023
    frac ← 0b1 || frS[12-63]
    Denormalize operand
      Do while exp < -126
        frac ← 0b0 || frac[0-62]
        exp ← exp + 1
      End
    WORD[0] ← sign
    WORD[1-8] ← 0x00
    WORD[9-31] ← frac[1-23]
  else WORD ← undefined

Notice that if the value to be stored by a single-precision store floating-point instruction is 
larger in magnitude than the maximum number representable in single format, the first case 
mentioned, “No Denormalization Required,” applies. The result stored in WORD is then a 
well-defined value, but is not numerically equal to the value in the source register (that is, 
the result of a single-precision load floating-point from WORD will not compare equal to 
the contents of the original source register). 

Note that the description of conversion steps presented here is only a model. The actual 
implementation may vary from this description but must produce results equivalent to what 
this model would produce. 

It is important to note that for double-precision store floating-point instructions and for the 
store floating-point as integer word instruction no conversion is required as the data from 
the FPR is copied directly into memory. 
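The single-precision store conversion can be modelled on bit images in the same way. In this Python sketch frs is the 64-bit register image and the return value is the 32-bit WORD, with None standing in for the undefined case. The names stfs_convert and double_bits are hypothetical, and the denormalization step is treated as a plain right shift that truncates, as the model above describes.

```python
import struct

def double_bits(x):
    """Return the 64-bit image of a Python float (helper for experiments)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def stfs_convert(frs):
    """Model of the word stored by a single-precision store, from a double image."""
    e = (frs >> 52) & 0x7FF                            # frS[1-11]
    if e > 896 or (frs & 0x7FFFFFFFFFFFFFFF) == 0:     # No Denormalization Required
        # WORD[0-1] <- frS[0-1], WORD[2-31] <- frS[5-34]
        return ((frs >> 62) << 30) | ((frs >> 29) & 0x3FFFFFFF)
    if 874 <= e <= 896:                                # Denormalization Required
        sign = frs >> 63
        exp = e - 1023
        frac = (1 << 52) | (frs & ((1 << 52) - 1))     # 0b1 || frS[12-63]
        while exp < -126:
            frac >>= 1                                 # shift in a leading zero
            exp += 1
        return (sign << 31) | ((frac >> 29) & 0x7FFFFF)  # WORD[9-31] <- frac[1-23]
    return None                                        # WORD undefined
```

For values that are exactly representable in single format, the word produced should match the host's own single-precision encoding of the same value.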
Appendix E. Synchronization Programming Examples

The examples in this appendix show how synchronization instructions can be used to 
emulate various synchronization primitives and how to provide more complex forms of 
synchronization. 

For each of these examples, it is assumed that a similar sequence of instructions is used by 
all processes requiring synchronization of the accessed data. 

E.1 General Information 

The following points provide general information about the lwarx and stwcx. instructions: 

• In general, lwarx and stwcx. instructions should be paired, with the same effective 
address (EA) used for both. The only exception is that an unpaired stwcx. instruction 
to any (scratch) effective address can be used to clear any reservation held by the 
processor. 

• It is acceptable to execute a lwarx instruction for which no stwcx. instruction is 
executed. Such a dangling lwarx instruction occurs in the example shown in 
Section E.2.5, “Test and Set,” if the value loaded is not zero. 

• To increase the likelihood that forward progress is made, it is important that looping 
on lwarx/stwcx. pairs be minimized. For example, in the sequence shown in 
Section E.2.5, “Test and Set,” this is achieved by testing the old value before 
attempting the store — were the order reversed, more stwcx. instructions might be 
executed, and reservations might more often be lost between the lwarx and the 
stwcx. instructions. 

• The manner in which lwarx and stwcx. are communicated to other processors and 
mechanisms, and between levels of the memory subsystem within a given processor, 
is implementation-dependent. In some implementations, performance may be 
improved by minimizing looping on an lwarx instruction that fails to return a 
desired value. For example, in the example provided in Section E.2.5, “Test and 
Set,” if the program stays in the loop until the word loaded is zero, the programmer 
can change the “bne- $+12” to “bne- loop.” 

In some implementations, better performance may be obtained by using an ordinary 
load instruction to do the initial checking of the value, as follows: 

loop: lwz    r5,0(r3)   #load the word
      cmpwi  r5,0       #loop back if word
      bne-   loop       #not equal to 0
      lwarx  r5,0,r3    #try again, reserving
      cmpwi  r5,0       #(likely to succeed)
      bne    loop
      stwcx. r4,0,r3    #try to store nonzero
      bne-   loop       #loop if lost reservation


• In a multiprocessor, livelock (a state in which processors interact in a way such that 
no processor makes progress) is possible if a loop containing an lwarx/stwcx. pair 
also contains an ordinary store instruction for which any byte of the affected 
memory area is in the reservation granule of the reservation. For example, the first 
code sequence shown in Section E.5, “List Insertion,” can cause livelock if two list 
elements have next element pointers in the same reservation granule. 

E.2 Synchronization Primitives 

The following examples show how the lwarx and stwcx. instructions can be used to 
emulate various synchronization primitives. The sequences used to emulate the various 
primitives consist primarily of a loop using the lwarx and stwcx. instructions. Additional 
synchronization is unnecessary, because the stwcx. will fail, clearing the EQ bit, if the word 
loaded by lwarx has changed before the stwcx. is executed. 

E.2.1 Fetch and No-Op 

The fetch and no-op primitive atomically loads the current value in a word in memory. In 
this example, it is assumed that the address of the word to be loaded is in GPR3 and the data 
loaded are returned in GPR4. 

loop: lwarx r4,0,r3 #load and reserve 

stwcx. r4,0,r3 #store old value if still reserved 
bne- loop #loop if lost reservation 

The stwcx., if it succeeds, stores to the destination location the same value that was loaded 
by the preceding lwarx. While the store is redundant with respect to the value in the 
location, its success ensures that the value loaded by the lwarx was the current value (that 
is, the source of the value loaded by the lwarx was the last store to the location that 
preceded the stwcx. in the coherence order for the location). 
E.2.2 Fetch and Store

The fetch and store primitive atomically loads and replaces a word in memory. 

In this example, it is assumed that the address of the word to be loaded and replaced is in 
GPR3, the new value is in GPR4, and the old value is returned in GPR5. 

loop: lwarx r5,0,r3 #load and reserve 

stwcx. r4,0,r3 #store new value if still reserved

bne- loop #loop if lost reservation 

E.2.3 Fetch and Add 

The fetch and add primitive atomically increments a word in memory. 

In this example, it is assumed that the address of the word to be incremented is in GPR3, 
the increment is in GPR4, and the old value is returned in GPR5. 

loop: lwarx r5,0,r3 #load and reserve 

add r0,r4,r5 #increment word 

stwcx. r0,0,r3 #store new value if still reserved 

bne- loop #loop if lost reservation 
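The retry behaviour of these loops can be explored with a toy model of a reservation. The Python sketch below is purely illustrative: the Word class, its lwarx and stwcx methods, and the processor labels are assumptions made for the example, a successful store is assumed to clear every outstanding reservation on the word, and real reservation granules and memory coherence are not modelled.

```python
class Word:
    """Toy single-word memory with one reservation per simulated processor."""
    def __init__(self, value=0):
        self.value = value
        self.reserved_by = set()

    def lwarx(self, cpu):
        self.reserved_by.add(cpu)         # load and reserve
        return self.value

    def stwcx(self, cpu, new):
        if cpu not in self.reserved_by:
            return False                  # reservation lost, store not performed
        self.value = new
        self.reserved_by.clear()          # a store clears all reservations
        return True

def fetch_and_add(word, cpu, inc):
    while True:
        old = word.lwarx(cpu)             # loop: lwarx r5,0,r3
        if word.stwcx(cpu, old + inc):    # add r0,r4,r5 / stwcx. r0,0,r3
            return old                    # success: bne- loop falls through
```

If another simulated processor completes an update between one processor's lwarx and stwcx, the conditional store fails and the loop would simply reload and try again.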

E.2.4 Fetch and AND 

The fetch and AND primitive atomically ANDs a value into a word in memory. 

In this example, it is assumed that the address of the word to be ANDed is in GPR3, the 
value to AND into it is in GPR4, and the old value is returned in GPR5. 

loop: lwarx r5,0,r3 #load and reserve 

and r0,r4,r5 #AND word 

stwcx. r0,0,r3 #store new value if still reserved 

bne- loop #loop if lost reservation 

This sequence can be changed to perform another Boolean operation atomically on a word 
in memory, simply by changing the AND instruction to the desired Boolean instruction 
(OR, XOR, etc.). 

E.2.5 Test and Set 

This version of the test and set primitive atomically loads a word from memory, ensures that 
the word in memory is a nonzero value, and sets CR0[EQ] according to whether the value 
loaded is zero. 

In this example, it is assumed that the address of the word to be tested is in GPR3, the new 
value (nonzero) is in GPR4, and the old value is returned in GPR5. 

loop: lwarx r5,0,r3 #load and reserve 

cmpwi r5, 0 #done if word 

bne $+12 #not equal to 0 

stwcx. r4,0,r3 #try to store non-zero 

bne- loop #loop if lost reservation 
E.3 Compare and Swap

The compare and swap primitive atomically compares a value in a register with a word in
memory. If they are equal, it stores the value from a second register into the word in
memory. If they are unequal, it loads the word from memory into the first register, and sets
the EQ bit of the CR0 field to indicate the result of the comparison.

In this example, it is assumed that the address of the word to be tested is in GPR3, the word
that is compared is in GPR4, the new value is in GPR5, and the old value is returned in
GPR4.

loop: lwarx  r6,0,r3    #load and reserve
      cmpw   r4,r6      #first 2 operands equal?
      bne-   exit       #skip if not
      stwcx. r5,0,r3    #store new value if still reserved
      bne-   loop       #loop if lost reservation
exit: mr     r4,r6      #return value from memory

Notes:


1. The semantics in this example are based on the IBM System/370™ compare and
swap instruction. Other architectures may define this instruction differently.

2. Compare and swap is shown primarily for pedagogical reasons. It is useful on
machines that lack the better synchronization facilities provided by the lwarx and
stwcx. instructions. Although the instruction is atomic, it checks only for whether
the current value matches the old value. An error can occur if the value had been
changed and restored before being tested.

3. In some applications, the second bne- instruction and/or the mr instruction can be
omitted. The first bne- is needed only if the application requires that if the EQ bit of
CR0 field on exit indicates not equal, then the original compared value in r4 and r6
are in fact not equal. The mr is needed only if the application requires that if the
compared values are not equal, then the word from memory is loaded into the
register with which it was compared (rather than into a third register). If either, or
both, of these instructions is omitted, the resulting compare and swap does not obey
the IBM System/370 semantics.
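The changed-and-restored hazard mentioned in note 2 can be demonstrated with a toy reservation model in Python. Everything here is an assumption made for illustration: the Word class, the plain store method, and the processor label are hypothetical, and any store to the word is assumed to clear all reservations on it.

```python
class Word:
    """Toy single-word memory with one reservation per simulated processor."""
    def __init__(self, value=0):
        self.value = value
        self.reserved_by = set()

    def lwarx(self, cpu):
        self.reserved_by.add(cpu)         # load and reserve
        return self.value

    def store(self, new):
        self.value = new                  # ordinary store by any processor
        self.reserved_by.clear()

    def stwcx(self, cpu, new):
        if cpu not in self.reserved_by:
            return False                  # reservation lost
        self.store(new)
        return True

def compare_and_swap(word, cpu, expected, new):
    """Return (swapped, value seen), following the lwarx/cmpw/stwcx. loop."""
    while True:
        seen = word.lwarx(cpu)            # loop: lwarx r6,0,r3
        if seen != expected:              # cmpw r4,r6 / bne- exit
            return False, seen
        if word.stwcx(cpu, new):          # stwcx. r5,0,r3
            return True, seen
```

A compare and swap issued after the word has gone from 2 to 7 and back to 2 still succeeds, because it only compares values. A reservation taken before those two stores is lost, so a stwcx. paired with the earlier lwarx would fail.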
E.4 Lock Acquisition and Release

This example provides an algorithm for locking that demonstrates the use of
synchronization with an atomic read/modify/write operation. GPR3 holds the address of a
shared memory location, which is passed as an argument to the lock and unlock procedures.
This location is used as a lock to control access to some shared resource such as a data
structure. The lock is open when its value is zero and locked when it is one. Before
accessing the shared resource, a processor sets the lock by having the lock procedure call
TEST_AND_SET, which executes the code sequence in Section E.2.5, “Test and Set.” This
atomically tests the old value of the lock and writes the new value (1) given to it in GPR4,
returning the old value in GPR5 (not used in the following example) and setting the EQ bit
in CR0 according to whether the value loaded is zero. The lock procedure repeats the test
and set procedure until it successfully changes the value in the lock from zero to one.

The processor must not access the shared resource until it sets the lock. After the bne- 
instruction that checks for the successful test and set operation, the processor executes the 
isync instruction. This delays all subsequent instructions until all previous instructions have 
completed to the extent required by context synchronization. The sync instruction could be 
used but performance would be degraded because the sync instruction waits for all 
outstanding memory accesses to complete with respect to other processors. This is not 
necessary here. 


lock: li    r4,1            #obtain lock
loop: bl    test_and_set    #test and set
      bne-  loop            #retry until old = 0
                            #delay subsequent instructions until
      isync                 #previous ones complete
      blr                   #return


The unlock procedure writes a zero to the lock location. If the access to the shared resource 
includes write operations, most applications that use locking require the processor to 
execute a sync instruction to make its modification visible to all processors before releasing 
the lock. For this reason, the unlock procedure in the following example begins with a sync. 


unlock: sync                #delay until prior stores finish
        li    r1,0
        stw   r1,0(r3)      #store zero to lock location
        blr                 #return
E.5 List Insertion

The following example shows how the lwarx and stwcx. instructions can be used to 
implement simple LIFO (last-in-first-out) insertion into a singly-linked list. (Complicated 
list insertion, in which multiple values must be changed atomically, or in which the correct 
order of insertion depends on the contents of the elements, cannot be implemented in the 
manner shown below, and requires a more complicated strategy such as using locks.) 

The next element pointer from the list element after which the new element is to be inserted, 
here called the parent element, is stored into the new element, so that the new element 
points to the next element in the list — this store is performed unconditionally. Then the 
address of the new element is conditionally stored into the parent element, thereby adding 
the new element to the list. 

In this example, it is assumed that the address of the parent element is in GPR3, the address 
of the new element is in GPR4, and the next element pointer is at offset zero from the start 
of the element. It is also assumed that the next element pointer of each list element is in a 
reservation granule separate from that of the next element pointer of all other list elements. 

loop:   lwarx   r2,0,r3         #get next pointer
        stw     r2,0(r4)        #store in new element
        sync                    #let store settle (can omit if not MP)
        stwcx.  r4,0,r3         #add new element to list
        bne-    loop            #loop if stwcx. failed

In the preceding example, if two list elements have next element pointers in the same 
reservation granule in a multiprocessor system, livelock can occur. 
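The retry behavior of the loop above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the `Memory` class, `foreign_store`, `insert_after`, and `walk` are hypothetical names, the reservation granule is taken to be a single word, and only one processor's reservation is modelled.

```python
class Memory:
    """Toy word-addressed memory with one reservation (one modelled processor)."""

    def __init__(self):
        self.words = {}
        self.reservation = None

    def lwarx(self, addr):
        self.reservation = addr          # load and establish the reservation
        return self.words.get(addr, 0)

    def stw(self, addr, value):
        self.words[addr] = value         # ordinary store by the modelled processor

    def stwcx(self, addr, value):
        """Conditional store: succeeds only while the reservation is intact."""
        ok = self.reservation == addr
        if ok:
            self.words[addr] = value
        self.reservation = None
        return ok

    def foreign_store(self, addr, value):
        """Store by another processor; cancels a reservation on the same word."""
        self.words[addr] = value
        if self.reservation == addr:
            self.reservation = None


def insert_after(mem, parent, new, interfere=None):
    """Mirror of the lwarx/stw/stwcx. loop; returns the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        nxt = mem.lwarx(parent)          # lwarx  r2,0,r3
        mem.stw(new, nxt)                # stw    r2,0(r4)
        if interfere is not None:        # let another processor get in once
            interfere()
            interfere = None
        if mem.stwcx(parent, new):       # stwcx. r4,0,r3
            return attempts              # bne- loop falls through


def walk(mem, head):
    """Return the element addresses reachable from the head pointer."""
    out, addr = [], mem.words.get(head, 0)
    while addr:
        out.append(addr)
        addr = mem.words.get(addr, 0)
    return out
```

In this model an interfering store to the parent's next pointer makes the first stwcx. fail, and the second pass picks up the new next pointer before linking the element.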

If it is not possible to allocate list elements such that each element’s next element pointer 
is in a different reservation granule, livelock can be avoided by using the following 
sequence: 

        lwz     r2,0(r3)        #get next pointer
loop1:  mr      r5,r2           #keep a copy
        stw     r2,0(r4)        #store in new element
        sync                    #let store settle
loop2:  lwarx   r2,0,r3         #get it again
        cmpw    r2,r5           #loop if changed (someone
        bne-    loop1           #else progressed)
        stwcx.  r4,0,r3         #add new element to list
        bne-    loop2           #loop if failed

Appendix F. Simplified Mnemonics 

This appendix is provided in order to simplify the writing and comprehension of assembler 
language programs. Included are a set of simplified mnemonics and symbols that define the 
simple shorthand used for the most frequently-used forms of branch conditional, compare, 
trap, rotate and shift, and certain other instructions. 

NOTE: The architecture specification refers to simplified mnemonics as extended 
mnemonics. 

F.1 Symbols 

The symbols in Table F-1 are defined for use in instructions (basic or simplified 
mnemonics) that specify a condition register (CR) field or a bit in the CR. 


Table F-1. Condition Register Bit and Identification Symbol Descriptions 

Symbol  Value  Bit Field Range  Description
lt      0      -                Less than. Identifies a bit number within a CR field.
gt      1      -                Greater than. Identifies a bit number within a CR field.
eq      2      -                Equal. Identifies a bit number within a CR field.
so      3      -                Summary overflow. Identifies a bit number within a CR field.
un      3      -                Unordered (after floating-point comparison). Identifies a bit number in a CR field.
cr0     0      0-3              CR0 field
cr1     1      4-7              CR1 field
cr2     2      8-11             CR2 field
cr3     3      12-15            CR3 field
cr4     4      16-19            CR4 field
cr5     5      20-23            CR5 field
cr6     6      24-27            CR6 field
cr7     7      28-31            CR7 field

Note: To identify a CR bit, an expression in which a CR field symbol is multiplied by 4 and then added to a 
bit-number-within-CR-field symbol can be used. 

The simplified mnemonics in Section F.5.2, “Basic Branch Mnemonics,” and Section F.6, 
“Simplified Mnemonics for Condition Register Logical Instructions,” require identification 
of a CR bit: if one of the CR field symbols is used, it must be multiplied by 4 and added 
to a bit-number-within-CR-field (value in the range of 0-3, explicit or symbolic). 

The simplified mnemonics in Section F.5.3, “Branch Mnemonics Incorporating 
Conditions,” and Section F.3, “Simplified Mnemonics for Compare Instructions,” require 
identification of a CR field — if one of the CR field symbols is used, it must not be multiplied 
by 4. 

Also, for the simplified mnemonics in Section F.5.3, “Branch Mnemonics Incorporating 
Conditions,” the bit number within the CR field is part of the simplified mnemonic. The CR 
field is identified, and the assembler does the multiplication and addition required to 
produce a CR bit number for the BI field of the underlying basic mnemonic. 
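The symbol arithmetic described above can be checked with a few lines of Python. This is an illustrative sketch rather than assembler behavior: the symbol values come from Table F-1, while the function name `cr_bit` is an assumption introduced here.

```python
# Symbol values as listed in Table F-1; cr_bit is a hypothetical helper name.
CR_FIELD = {f"cr{i}": i for i in range(8)}
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}


def cr_bit(field, bit):
    """Return the CR bit number for the expression 4 * field + bit."""
    return 4 * CR_FIELD[field] + CR_BIT[bit]


print(cr_bit("cr5", "eq"))  # 4 * 5 + 2 = 22
print(cr_bit("cr3", "so"))  # 4 * 3 + 3 = 15
```

The result is the value that ends up in the BI field (or in a condition register logical operand) of the underlying basic instruction.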

F.2 Simplified Mnemonics for Subtract Instructions 

This section discusses simplified mnemonics for the subtract instructions. 

F.2.1 Subtract Immediate 

Although there is no subtract immediate instruction, its effect can be achieved by using an 
add immediate instruction with the immediate operand negated. Simplified mnemonics are 
provided that include this negation, making the intent of the computation more clear. 

subi    rD,rA,value     (equivalent to addi    rD,rA,-value)
subis   rD,rA,value     (equivalent to addis   rD,rA,-value)
subic   rD,rA,value     (equivalent to addic   rD,rA,-value)
subic.  rD,rA,value     (equivalent to addic.  rD,rA,-value)

F.2.2 Subtract 

The subtract from instructions subtract the second operand (rA) from the third (rB). 
Simplified mnemonics are provided that use the more normal order in which the third 
operand is subtracted from the second. Both these mnemonics can be coded with an o suffix 
and/or dot (.) suffix to cause the OE and/or Rc bit to be set in the underlying instruction. 

sub rD,rA,rB (equivalent to subf rD,rB,rA) 

subc rD,rA,rB (equivalent to subfc rD,rB,rA) 
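The two rewriting rules in this section (negate the immediate, or swap the register operands) can be sketched in Python as follows. The function name `expand_subtract` and the string-based operand handling are assumptions made for illustration; a real assembler works on parsed operands.

```python
IMMEDIATE = {"subi": "addi", "subis": "addis", "subic": "addic", "subic.": "addic."}
REGISTER = (("subc", "subfc"), ("sub", "subf"))   # longer prefix checked first


def expand_subtract(mnemonic, rd, ra, third):
    """Rewrite a simplified subtract mnemonic as its basic instruction."""
    if mnemonic in IMMEDIATE:
        # Subtract immediate: same operand order, immediate negated.
        return f"{IMMEDIATE[mnemonic]} {rd},{ra},{-third}"
    for simplified, basic in REGISTER:
        suffix = mnemonic[len(simplified):]
        if mnemonic.startswith(simplified) and suffix in ("", "o", ".", "o."):
            # Subtract: operands rA and rB are swapped; o and . carry over.
            return f"{basic}{suffix} {rd},{third},{ra}"
    raise ValueError(f"not a simplified subtract mnemonic: {mnemonic}")


print(expand_subtract("subi", "r3", "r4", 8))       # addi r3,r4,-8
print(expand_subtract("sub", "r3", "r4", "r5"))     # subf r3,r5,r4
```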

F.3 Simplified Mnemonics for Compare Instructions 

The crfD field can be omitted if the result of the comparison is to be placed into the CR0 
field. Otherwise, the target CR field must be specified as the first operand. One of the CR 
field symbols defined in Section F.1, “Symbols,” can be used for this operand. 

NOTE: The basic compare mnemonics of PowerPC are the same as those of POWER, 
but the POWER instructions have three operands whereas the PowerPC 
instructions have four. 

The assembler recognizes a basic compare mnemonic with three operands as 
the POWER form, and generates the instruction with L = 0. The crfD field can 
normally be omitted when the CR0 field is the target. 

F.3.1 Word Comparisons 

The instructions listed in Table F-2 are simplified mnemonics that should be supported by 
assemblers for all PowerPC implementations. 


Table F-2. Simplified Mnemonics for Word Compare Instructions 

Operation                        Simplified Mnemonic     Equivalent to:
Compare Word Immediate           cmpwi crfD,rA,SIMM      cmpi crfD,0,rA,SIMM
Compare Word                     cmpw crfD,rA,rB         cmp crfD,0,rA,rB
Compare Logical Word Immediate   cmplwi crfD,rA,UIMM     cmpli crfD,0,rA,UIMM
Compare Logical Word             cmplw crfD,rA,rB        cmpl crfD,0,rA,rB


Following are examples using the word compare mnemonics. 

1. Compare rA with immediate value 100 as signed 32-bit integers and place result in 
CR0. 

cmpwi rA,100 (equivalent to cmpi 0,0,rA,100) 

2. Same as (1), but place results in CR4. 

cmpwi cr4,rA,100 (equivalent to cmpi 4,0,rA,100) 

3. Compare rA and rB as unsigned 32-bit integers and place result in CR0. 

cmplw rA,rB (equivalent to cmpl 0,0,rA,rB) 
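The operand handling for the word compare mnemonics (optional crfD, L forced to 0) can be sketched in Python. The name `expand_compare` is hypothetical, and operands are passed as plain strings or integers purely for illustration.

```python
BASIC = {"cmpwi": "cmpi", "cmpw": "cmp", "cmplwi": "cmpli", "cmplw": "cmpl"}


def expand_compare(mnemonic, *operands):
    """Expand a word compare mnemonic; operands are [crfD,] rA, rB-or-immediate."""
    if len(operands) == 2:
        operands = (0,) + operands        # crfD omitted: the target is CR0
    crfd, a, b = operands
    if isinstance(crfd, str):
        crfd = int(crfd[2:])              # "cr4" -> 4; field symbols are not multiplied by 4
    return f"{BASIC[mnemonic]} {crfd},0,{a},{b}"   # second operand is L = 0


print(expand_compare("cmpwi", "rA", 100))          # cmpi 0,0,rA,100
print(expand_compare("cmpwi", "cr4", "rA", 100))   # cmpi 4,0,rA,100
print(expand_compare("cmplw", "rA", "rB"))         # cmpl 0,0,rA,rB
```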

F.4 Simplified Mnemonics for Rotate and Shift 
Instructions 

The rotate and shift instructions provide powerful and general ways to manipulate register 
contents, but can be difficult to understand. Simplified mnemonics that allow some of the 
simpler operations to be coded easily are provided for the following types of operations: 

• Extract — Select a field of n bits starting at bit position b in the source register; left 
or right justify this field in the target register; clear all other bits of the target register. 

• Insert — Select a left-justified or right-justified field of n bits in the source register; 
insert this field starting at bit position b of the target register; leave other bits of the 
target register unchanged. (No simplified mnemonic is provided for insertion of a 
left-justified field, when operating on double words, because such an insertion 
requires more than one instruction.) 

• Rotate — Rotate the contents of a register right or left n bits without masking. 

• Shift — Shift the contents of a register right or left n bits, clearing vacated bits 
(logical shift). 

• Clear — Clear the leftmost or rightmost n bits of a register. 

• Clear left and shift left — Clear the leftmost b bits of a register, then shift the register 
left by n bits. This operation can be used to scale a (known non-negative) array index 
by the width of an element. 

F.4.1 Operations on Words 

The operations shown in Table F-3 are available in all implementations. All these 
mnemonics can be coded with a dot (.) suffix to cause the Rc bit to be set in the underlying 
instruction. 


Table F-3. Word Rotate and Shift Instructions 

Operation                            Simplified Mnemonic                Equivalent to:
Extract and left justify immediate   extlwi rA,rS,n,b (n > 0)           rlwinm rA,rS,b,0,n - 1
Extract and right justify immediate  extrwi rA,rS,n,b (n > 0)           rlwinm rA,rS,b + n,32 - n,31
Insert from left immediate           inslwi rA,rS,n,b (n > 0)           rlwimi rA,rS,32 - b,b,(b + n) - 1
Insert from right immediate          insrwi rA,rS,n,b (n > 0)           rlwimi rA,rS,32 - (b + n),b,(b + n) - 1
Rotate left immediate                rotlwi rA,rS,n                     rlwinm rA,rS,n,0,31
Rotate right immediate               rotrwi rA,rS,n                     rlwinm rA,rS,32 - n,0,31
Rotate left                          rotlw rA,rS,rB                     rlwnm rA,rS,rB,0,31
Shift left immediate                 slwi rA,rS,n (n < 32)              rlwinm rA,rS,n,0,31 - n
Shift right immediate                srwi rA,rS,n (n < 32)              rlwinm rA,rS,32 - n,n,31
Clear left immediate                 clrlwi rA,rS,n (n < 32)            rlwinm rA,rS,0,n,31
Clear right immediate                clrrwi rA,rS,n (n < 32)            rlwinm rA,rS,0,0,31 - n
Clear left and shift left immediate  clrlslwi rA,rS,b,n (n ≤ b ≤ 31)    rlwinm rA,rS,n,b - n,31 - n


Examples using word mnemonics follow: 

1. Extract the sign bit (bit 0) of rS and place the result right-justified into rA. 

extrwi rA,rS,1,0 (equivalent to rlwinm rA,rS,1,31,31) 

2. Insert the bit extracted in (1) into the sign bit (bit 0) of rB. 

insrwi rB,rA,1,0 (equivalent to rlwimi rB,rA,31,0,0) 

3. Shift the contents of rA left 8 bits. 

slwi rA,rA,8 (equivalent to rlwinm rA,rA,8,0,23) 

4. Clear the high-order 16 bits of rS and place the result into rA. 

clrlwi rA,rS,16 (equivalent to rlwinm rA,rS,0,16,31) 
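To see why these operand mappings produce the described results, the rotate-and-mask operations can be modelled on 32-bit integers in Python. This is a sketch, not a definition of the instructions: bit 0 is taken as the most significant bit, register values are plain unsigned integers, and all function names are introduced here for illustration.

```python
M32 = 0xFFFFFFFF


def mask(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB), wrapping around if mb > me."""
    bits, i = 0, mb
    while True:
        bits |= 1 << (31 - i)
        if i == me:
            return bits
        i = (i + 1) % 32


def rotl32(x, n):
    n %= 32
    return ((x << n) | (x >> (32 - n))) & M32


def rlwinm(rs, sh, mb, me):
    """Rotate left, then AND with the mask."""
    return rotl32(rs, sh) & mask(mb, me)


def rlwimi(ra, rs, sh, mb, me):
    """Rotate left, then insert under the mask into the old value of rA."""
    m = mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & M32)


# Operand mappings for four of the word mnemonics in Table F-3.
def extrwi(rs, n, b):
    return rlwinm(rs, b + n, 32 - n, 31)


def insrwi(ra, rs, n, b):
    return rlwimi(ra, rs, 32 - (b + n), b, (b + n) - 1)


def slwi(rs, n):
    return rlwinm(rs, n, 0, 31 - n)


def clrlwi(rs, n):
    return rlwinm(rs, 0, n, 31)


print(hex(slwi(0x12345678, 8)))      # 0x34567800
print(hex(clrlwi(0xDEADBEEF, 16)))   # 0xbeef
```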

F.5 Simplified Mnemonics for Branch Instructions 

Mnemonics are provided so that branch conditional instructions can be coded with the 
condition as part of the instruction mnemonic rather than as a numeric operand. Some of 
these are shown as examples with the branch instructions. 

The mnemonics discussed in this section are variations of the branch conditional 
instructions. 

F.5.1 BO and BI Fields 

The 5-bit BO field in branch conditional instructions encodes the following operations. 

• Decrement count register (CTR) 

• Test CTR equal to zero 

• Test CTR not equal to zero 

• Test condition true 

• Test condition false 

• Branch prediction (taken, fall through) 

The 5-bit BI field in branch conditional instructions specifies which of the 32 bits in the CR 
represents the condition to test. 

To provide a simplified mnemonic for every possible combination of BO and BI fields 
would require 2^10 = 1024 mnemonics and most of these would be only marginally useful. 
The abbreviated set found in Section F.5.2, “Basic Branch Mnemonics,” is intended to 
cover the most useful cases. Unusual cases can be coded using a basic branch conditional 
mnemonic (bc, bclr, bcctr) with the condition to be tested specified as a numeric operand. 

F.5.2 Basic Branch Mnemonics 

The mnemonics in Table F-4 allow all the common BO operand encodings to be specified 
as part of the mnemonic, along with the absolute address (AA), and set link register (LR) 
bits. 

Notice that there are no simplified mnemonics for relative and absolute unconditional 
branches. For these, the basic mnemonics b, ba, bl, and bla are used. 

Table F-4 provides the abbreviated set of simplified mnemonics for the most commonly 
performed conditional branches. 



Table F-4. Simplified Branch Mnemonics 

                                                           LR Update Not Enabled                  LR Update Enabled
Branch Semantics                                           bc        bca       bclr      bcctr    bcl       bcla      bclrl     bcctrl
                                                           Relative  Absolute  to LR     to CTR   Relative  Absolute  to LR     to CTR
Branch unconditionally                                     -         -         blr       bctr     -         -         blrl      bctrl
Branch if condition true                                   bt        bta       btlr      btctr    btl       btla      btlrl     btctrl
Branch if condition false                                  bf        bfa       bflr      bfctr    bfl       bfla      bflrl     bfctrl
Decrement CTR, branch if CTR non-zero                      bdnz      bdnza     bdnzlr    -        bdnzl     bdnzla    bdnzlrl   -
Decrement CTR, branch if CTR non-zero AND condition true   bdnzt     bdnzta    bdnztlr   -        bdnztl    bdnztla   bdnztlrl  -
Decrement CTR, branch if CTR non-zero AND condition false  bdnzf     bdnzfa    bdnzflr   -        bdnzfl    bdnzfla   bdnzflrl  -
Decrement CTR, branch if CTR zero                          bdz       bdza      bdzlr     -        bdzl      bdzla     bdzlrl    -
Decrement CTR, branch if CTR zero AND condition true       bdzt      bdzta     bdztlr    -        bdztl     bdztla    bdztlrl   -
Decrement CTR, branch if CTR zero AND condition false      bdzf      bdzfa     bdzflr    -        bdzfl     bdzfla    bdzflrl   -


The simplified mnemonics shown in Table F-4 that test a condition require a corresponding 
CR bit as the first operand of the instruction. The symbols defined in Section F.1, 
“Symbols,” can be used in the operand in place of a numeric value. 

The simplified mnemonics found in Table F-4 are used in the following examples: 

1. Decrement CTR and branch if it is still nonzero (closure of a loop controlled by a 
count loaded into CTR). 

bdnz target (equivalent to bc 16,0,target) 

2. Same as (1) but branch only if CTR is non-zero and condition in CR0 is “equal.” 

bdnzt eq,target (equivalent to bc 8,2,target) 

3. Same as (2), but “equal” condition is in CR5. 

bdnzt 4 * cr5 + eq,target (equivalent to bc 8,22,target) 

4. Branch if bit 27 of CR is false. 

bf 27,target (equivalent to bc 4,27,target) 

5. Same as (4), but set the link register. This is a form of conditional call. 

bfl 27,target (equivalent to bcl 4,27,target) 
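The mapping from a basic branch mnemonic to its BO value and underlying instruction can be sketched in Python. The BO values are those used in the examples and tables of this section (with the y bit clear); the function name `expand_branch` and the convention of passing the CR bit as an integer are assumptions made for this sketch.

```python
# BO value for each operation, y bit clear.
BO = {"": 20, "t": 12, "f": 4, "dnz": 16, "dnzt": 8, "dnzf": 0,
      "dz": 18, "dzt": 10, "dzf": 2}
# Mnemonic suffix -> basic branch conditional instruction.
SUFFIX = {"": "bc", "a": "bca", "lr": "bclr", "ctr": "bcctr",
          "l": "bcl", "la": "bcla", "lrl": "bclrl", "ctrl": "bcctrl"}


def expand_branch(mnemonic, *operands):
    """Expand e.g. ('bdnzt', 2, 'target') to 'bc 8,2,target'."""
    body = mnemonic[1:]                              # drop the leading 'b'
    for op in sorted(BO, key=len, reverse=True):     # longest operation first
        suffix = body[len(op):]
        if body.startswith(op) and suffix in SUFFIX:
            break
    else:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    if op == "" and suffix not in ("lr", "ctr", "lrl", "ctrl"):
        raise ValueError("b, ba, bl, and bla are basic mnemonics")
    ops = list(operands)
    # Only the forms that test a condition take a CR bit as first operand.
    bi = ops.pop(0) if op.endswith(("t", "f")) else 0
    return f"{SUFFIX[suffix]} " + ",".join(str(x) for x in [BO[op], bi] + ops)


print(expand_branch("bdnz", "target"))               # bc 16,0,target
print(expand_branch("bdnzt", 4 * 5 + 2, "target"))   # bc 8,22,target
print(expand_branch("bfl", 27, "target"))            # bcl 4,27,target
```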

Table F-5 provides the simplified mnemonics for the bc and bca instructions without link 
register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 

Table F-5. Simplified Branch Mnemonics for bc and bca Instructions without Link 
Register Update 

                                                          LR Update Not Enabled
Branch Semantics                                          bc Relative       Simplified Mnemonic  bca Absolute       Simplified Mnemonic
Branch unconditionally                                    -                 -                    -                  -
Branch if condition true                                  bc 12,0,target    bt 0,target          bca 12,0,target    bta 0,target
Branch if condition false                                 bc 4,0,target     bf 0,target          bca 4,0,target     bfa 0,target
Decrement CTR, branch if CTR nonzero                      bc 16,0,target    bdnz target          bca 16,0,target    bdnza target
Decrement CTR, branch if CTR nonzero AND condition true   bc 8,0,target     bdnzt 0,target       bca 8,0,target     bdnzta 0,target
Decrement CTR, branch if CTR nonzero AND condition false  bc 0,0,target     bdnzf 0,target       bca 0,0,target     bdnzfa 0,target
Decrement CTR, branch if CTR zero                         bc 18,0,target    bdz target           bca 18,0,target    bdza target
Decrement CTR, branch if CTR zero AND condition true      bc 10,0,target    bdzt 0,target        bca 10,0,target    bdzta 0,target
Decrement CTR, branch if CTR zero AND condition false     bc 2,0,target     bdzf 0,target        bca 2,0,target     bdzfa 0,target

Table F-6 provides the simplified mnemonics for the bclr and bcctr instructions without 
link register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 


Table F-6. Simplified Branch Mnemonics for bclr and bcctr Instructions without 
Link Register Update 

                                                          LR Update Not Enabled
Branch Semantics                                          bclr to LR   Simplified Mnemonic  bcctr to CTR  Simplified Mnemonic
Branch unconditionally                                    bclr 20,0    blr                  bcctr 20,0    bctr
Branch if condition true                                  bclr 12,0    btlr 0               bcctr 12,0    btctr 0
Branch if condition false                                 bclr 4,0     bflr 0               bcctr 4,0     bfctr 0
Decrement CTR, branch if CTR nonzero                      bclr 16,0    bdnzlr               -             -
Decrement CTR, branch if CTR nonzero AND condition true   bclr 8,0     bdnztlr 0            -             -
Decrement CTR, branch if CTR nonzero AND condition false  bclr 0,0     bdnzflr 0            -             -
Decrement CTR, branch if CTR zero                         bclr 18,0    bdzlr                -             -
Decrement CTR, branch if CTR zero AND condition true      bclr 10,0    bdztlr 0             -             -
Decrement CTR, branch if CTR zero AND condition false     bclr 2,0     bdzflr 0             -             -

Table F-7 provides the simplified mnemonics for the bcl and bcla instructions with link 
register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 


Table F-7. Simplified Branch Mnemonics for bcl and bcla Instructions with Link 
Register Update 

                                                          LR Update Enabled
Branch Semantics                                          bcl Relative      Simplified Mnemonic  bcla Absolute      Simplified Mnemonic
Branch unconditionally                                    -                 -                    -                  -
Branch if condition true                                  bcl 12,0,target   btl 0,target         bcla 12,0,target   btla 0,target
Branch if condition false                                 bcl 4,0,target    bfl 0,target         bcla 4,0,target    bfla 0,target
Decrement CTR, branch if CTR nonzero                      bcl 16,0,target   bdnzl target         bcla 16,0,target   bdnzla target
Decrement CTR, branch if CTR nonzero AND condition true   bcl 8,0,target    bdnztl 0,target      bcla 8,0,target    bdnztla 0,target
Decrement CTR, branch if CTR nonzero AND condition false  bcl 0,0,target    bdnzfl 0,target      bcla 0,0,target    bdnzfla 0,target
Decrement CTR, branch if CTR zero                         bcl 18,0,target   bdzl target          bcla 18,0,target   bdzla target
Decrement CTR, branch if CTR zero AND condition true      bcl 10,0,target   bdztl 0,target       bcla 10,0,target   bdztla 0,target
Decrement CTR, branch if CTR zero AND condition false     bcl 2,0,target    bdzfl 0,target       bcla 2,0,target    bdzfla 0,target

Table F-8 provides the simplified mnemonics for the bclrl and bcctrl instructions with link 
register updating, and the syntax associated with these instructions. 


NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 


Table F-8. Simplified Branch Mnemonics for bclrl and bcctrl Instructions with Link 
Register Update 

                                                          LR Update Enabled
Branch Semantics                                          bclrl to LR  Simplified Mnemonic  bcctrl to CTR  Simplified Mnemonic
Branch unconditionally                                    bclrl 20,0   blrl                 bcctrl 20,0    bctrl
Branch if condition true                                  bclrl 12,0   btlrl 0              bcctrl 12,0    btctrl 0
Branch if condition false                                 bclrl 4,0    bflrl 0              bcctrl 4,0     bfctrl 0
Decrement CTR, branch if CTR nonzero                      bclrl 16,0   bdnzlrl              -              -
Decrement CTR, branch if CTR nonzero AND condition true   bclrl 8,0    bdnztlrl 0           -              -
Decrement CTR, branch if CTR nonzero AND condition false  bclrl 0,0    bdnzflrl 0           -              -
Decrement CTR, branch if CTR zero                         bclrl 18,0   bdzlrl               -              -
Decrement CTR, branch if CTR zero AND condition true      bclrl 10,0   bdztlrl 0            -              -
Decrement CTR, branch if CTR zero AND condition false     bclrl 2,0    bdzflrl 0            -              -

F.5.3 Branch Mnemonics Incorporating Conditions 

The mnemonics defined in Table F-10 are variations of the branch if condition true and 
branch if condition false BO encodings, with the most useful values of BI represented in 
the mnemonic rather than specified as a numeric operand. 

A standard set of codes (shown in Table F-9) has been adopted for the most common 
combinations of branch conditions. 

Table F-9. Standard Coding for Branch Conditions 

Code  Description
lt    Less than
le    Less than or equal
eq    Equal
ge    Greater than or equal
gt    Greater than
nl    Not less than
ne    Not equal
ng    Not greater than
so    Summary overflow
ns    Not summary overflow
un    Unordered (after floating-point comparison)
nu    Not unordered (after floating-point comparison)

Table F-10 shows the simplified branch mnemonics incorporating conditions. 


Table F-10. Simplified Branch Mnemonics with Comparison Conditions 

                                  LR Update Not Enabled                  LR Update Enabled
Branch Semantics                  bc        bca       bclr      bcctr    bcl       bcla      bclrl     bcctrl
                                  Relative  Absolute  to LR     to CTR   Relative  Absolute  to LR     to CTR
Branch if less than               blt       blta      bltlr     bltctr   bltl      bltla     bltlrl    bltctrl
Branch if less than or equal      ble       blea      blelr     blectr   blel      blela     blelrl    blectrl
Branch if equal                   beq       beqa      beqlr     beqctr   beql      beqla     beqlrl    beqctrl
Branch if greater than or equal   bge       bgea      bgelr     bgectr   bgel      bgela     bgelrl    bgectrl
Branch if greater than            bgt       bgta      bgtlr     bgtctr   bgtl      bgtla     bgtlrl    bgtctrl
Branch if not less than           bnl       bnla      bnllr     bnlctr   bnll      bnlla     bnllrl    bnlctrl
Branch if not equal               bne       bnea      bnelr     bnectr   bnel      bnela     bnelrl    bnectrl
Branch if not greater than        bng       bnga      bnglr     bngctr   bngl      bngla     bnglrl    bngctrl
Branch if summary overflow        bso       bsoa      bsolr     bsoctr   bsol      bsola     bsolrl    bsoctrl
Branch if not summary overflow    bns       bnsa      bnslr     bnsctr   bnsl      bnsla     bnslrl    bnsctrl
Branch if unordered               bun       buna      bunlr     bunctr   bunl      bunla     bunlrl    bunctrl
Branch if not unordered           bnu       bnua      bnulr     bnuctr   bnul      bnula     bnulrl    bnuctrl


Instructions using the mnemonics in Table F-10 specify the condition register field in an 
optional first operand. If the CR field being tested is CR0, this operand need not be 
specified. One of the CR field symbols defined in Section F.1, “Symbols,” can be used for 
this operand. 

The simplified mnemonics found in Table F-10 are used in the following examples: 

1. Branch if CR0 reflects condition “not equal.” 

bne target (equivalent to bc 4,2,target) 

2. Same as (1) but condition is in CR3. 

bne cr3,target (equivalent to bc 4,14,target) 

3. Branch to an absolute target if CR4 specifies “greater than,” setting the link register. 
This is a form of conditional “call.” 

bgtla cr4,target (equivalent to bcla 12,17,target) 

4. Same as (3), but target address is in the CTR. 

bgtctrl cr4 (equivalent to bcctrl 12,17) 
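The multiplication and addition that the assembler performs for these mnemonics can be sketched in Python. The BO and bit values follow the condition codes of Table F-9 and the examples above; the function name `expand_cond_branch` and the string handling are assumptions introduced for illustration.

```python
import re

# Condition code -> (BO, bit number within the CR field).
COND = {"lt": (12, 0), "le": (4, 1), "eq": (12, 2), "ge": (4, 0),
        "gt": (12, 1), "nl": (4, 0), "ne": (4, 2), "ng": (4, 1),
        "so": (12, 3), "ns": (4, 3), "un": (12, 3), "nu": (4, 3)}
SUFFIX = {"": "bc", "a": "bca", "lr": "bclr", "ctr": "bcctr",
          "l": "bcl", "la": "bcla", "lrl": "bclrl", "ctrl": "bcctrl"}


def expand_cond_branch(mnemonic, *operands):
    """Expand e.g. ('bne', 'cr3', 'target') to 'bc 4,14,target'."""
    bo, bit = COND[mnemonic[1:3]]          # every condition code has two letters
    basic = SUFFIX[mnemonic[3:]]
    ops = list(operands)
    crf = 0                                # CR0 when the field operand is omitted
    if ops and re.fullmatch(r"cr[0-7]", str(ops[0])):
        crf = int(ops.pop(0)[2])
    bi = 4 * crf + bit                     # the assembler's multiply-and-add
    return f"{basic} " + ",".join(str(x) for x in [bo, bi] + ops)


print(expand_cond_branch("bne", "cr3", "target"))    # bc 4,14,target
print(expand_cond_branch("bgtctrl", "cr4"))          # bcctrl 12,17
```

Note that, unlike the mnemonics of Section F.5.2, the CR field symbol is passed unmultiplied here; the bit number comes from the mnemonic itself.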

Table F-11 shows the simplified branch mnemonics for the bc and bca instructions without 
link register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 

Table F-11. Simplified Branch Mnemonics for bc and bca Instructions without 
Comparison Conditions and Link Register Updating 

                                  LR Update Not Enabled
Branch Semantics                  bc Relative       Simplified Mnemonic  bca Absolute       Simplified Mnemonic
Branch if less than               bc 12,0,target    blt target           bca 12,0,target    blta target
Branch if less than or equal      bc 4,1,target     ble target           bca 4,1,target     blea target
Branch if equal                   bc 12,2,target    beq target           bca 12,2,target    beqa target
Branch if greater than or equal   bc 4,0,target     bge target           bca 4,0,target     bgea target
Branch if greater than            bc 12,1,target    bgt target           bca 12,1,target    bgta target
Branch if not less than           bc 4,0,target     bnl target           bca 4,0,target     bnla target
Branch if not equal               bc 4,2,target     bne target           bca 4,2,target     bnea target
Branch if not greater than        bc 4,1,target     bng target           bca 4,1,target     bnga target
Branch if summary overflow        bc 12,3,target    bso target           bca 12,3,target    bsoa target
Branch if not summary overflow    bc 4,3,target     bns target           bca 4,3,target     bnsa target
Branch if unordered               bc 12,3,target    bun target           bca 12,3,target    buna target
Branch if not unordered           bc 4,3,target     bnu target           bca 4,3,target     bnua target

Table F-12 shows the simplified branch mnemonics for the bclr and bcctr instructions 
without link register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 

Table F-12. Simplified Branch Mnemonics for bclr and bcctr Instructions without 
Comparison Conditions and Link Register Updating 

                                  LR Update Not Enabled
Branch Semantics                  bclr to LR   Simplified Mnemonic  bcctr to CTR  Simplified Mnemonic
Branch if less than               bclr 12,0    bltlr                bcctr 12,0    bltctr
Branch if less than or equal      bclr 4,1     blelr                bcctr 4,1     blectr
Branch if equal                   bclr 12,2    beqlr                bcctr 12,2    beqctr
Branch if greater than or equal   bclr 4,0     bgelr                bcctr 4,0     bgectr
Branch if greater than            bclr 12,1    bgtlr                bcctr 12,1    bgtctr
Branch if not less than           bclr 4,0     bnllr                bcctr 4,0     bnlctr
Branch if not equal               bclr 4,2     bnelr                bcctr 4,2     bnectr
Branch if not greater than        bclr 4,1     bnglr                bcctr 4,1     bngctr
Branch if summary overflow        bclr 12,3    bsolr                bcctr 12,3    bsoctr
Branch if not summary overflow    bclr 4,3     bnslr                bcctr 4,3     bnsctr
Branch if unordered               bclr 12,3    bunlr                bcctr 12,3    bunctr
Branch if not unordered           bclr 4,3     bnulr                bcctr 4,3     bnuctr

Table F-13 shows the simplified branch mnemonics for the bcl and bcla instructions with 
link register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 

Table F-13. Simplified Branch Mnemonics for bcl and bcla Instructions with 
Comparison Conditions and Link Register Update 

                                  LR Update Enabled
Branch Semantics                  bcl Relative      Simplified Mnemonic  bcla Absolute      Simplified Mnemonic
Branch if less than               bcl 12,0,target   bltl target          bcla 12,0,target   bltla target
Branch if less than or equal      bcl 4,1,target    blel target          bcla 4,1,target    blela target
Branch if equal                   bcl 12,2,target   beql target          bcla 12,2,target   beqla target
Branch if greater than or equal   bcl 4,0,target    bgel target          bcla 4,0,target    bgela target
Branch if greater than            bcl 12,1,target   bgtl target          bcla 12,1,target   bgtla target
Branch if not less than           bcl 4,0,target    bnll target          bcla 4,0,target    bnlla target
Branch if not equal               bcl 4,2,target    bnel target          bcla 4,2,target    bnela target
Branch if not greater than        bcl 4,1,target    bngl target          bcla 4,1,target    bngla target
Branch if summary overflow        bcl 12,3,target   bsol target          bcla 12,3,target   bsola target
Branch if not summary overflow    bcl 4,3,target    bnsl target          bcla 4,3,target    bnsla target
Branch if unordered               bcl 12,3,target   bunl target          bcla 12,3,target   bunla target
Branch if not unordered           bcl 4,3,target    bnul target          bcla 4,3,target    bnula target

Table F-14 shows the simplified branch mnemonics for the bclrl and bcctrl instructions with 
link register updating, and the syntax associated with these instructions. 

NOTE: The default condition register specified by the simplified mnemonics in the table 
is CR0. 

Table F-14. Simplified Branch Mnemonics for bclrl and bcctrl Instructions with 
Comparison Conditions and Link Register Update 

                                  LR Update Enabled
Branch Semantics                  bclrl to LR  Simplified Mnemonic  bcctrl to CTR  Simplified Mnemonic
Branch if less than               bclrl 12,0   bltlrl 0             bcctrl 12,0    bltctrl 0
Branch if less than or equal      bclrl 4,1    blelrl 0             bcctrl 4,1     blectrl 0
Branch if equal                   bclrl 12,2   beqlrl 0             bcctrl 12,2    beqctrl 0
Branch if greater than or equal   bclrl 4,0    bgelrl 0             bcctrl 4,0     bgectrl 0
Branch if greater than            bclrl 12,1   bgtlrl 0             bcctrl 12,1    bgtctrl 0
Branch if not less than           bclrl 4,0    bnllrl 0             bcctrl 4,0     bnlctrl 0
Branch if not equal               bclrl 4,2    bnelrl 0             bcctrl 4,2     bnectrl 0
Branch if not greater than        bclrl 4,1    bnglrl 0             bcctrl 4,1     bngctrl 0
Branch if summary overflow        bclrl 12,3   bsolrl 0             bcctrl 12,3    bsoctrl 0
Branch if not summary overflow    bclrl 4,3    bnslrl 0             bcctrl 4,3     bnsctrl 0
Branch if unordered               bclrl 12,3   bunlrl 0             bcctrl 12,3    bunctrl 0
Branch if not unordered           bclrl 4,3    bnulrl 0             bcctrl 4,3     bnuctrl 0


F.5.4 Branch Prediction 

In branch conditional instructions that are not always taken, the low-order bit (y bit) of the 
BO field provides a hint about whether the branch is likely to be taken. See Section 4.2.4.2, 
“Conditional Branch Control,” for more information on the y bit. 

Assemblers should clear this bit unless otherwise directed. This default action indicates the 
following: 

• A branch conditional with a negative displacement field is predicted to be taken. 

• A branch conditional with a non-negative displacement field is predicted not to be 
taken (fall through). 

• A branch conditional to an address in the LR or CTR is predicted not to be taken (fall 
through). 

If the likely outcome (branch or fall through) of a given branch conditional instruction is 
known, a suffix can be added to the mnemonic that tells the assembler how to set the y bit. 
That is, ‘+’ indicates that the branch is to be taken and ‘-’ indicates that the branch is not 
to be taken. Such a suffix can be added to any branch conditional mnemonic, either basic 
or simplified. 

For relative and absolute branches (bc[l][a]), the setting of the y bit depends on whether the 
displacement field is negative or non-negative. For negative displacement fields, coding the 
suffix ‘+’ causes the bit to be cleared, and coding the suffix ‘-’ causes the bit to be set. For 
non-negative displacement fields, coding the suffix ‘+’ causes the bit to be set, and coding 
the suffix ‘-’ causes the bit to be cleared. 

For branches to an address in the LR or CTR (bclr[l] or bcctr[l]), coding the suffix ‘+’ 
causes the y bit to be set, and coding the suffix ‘-’ causes the bit to be cleared. 

Examples of branch prediction follow: 

1. Branch if CR0 reflects condition “less than,” specifying that the branch should be 
predicted to be taken. 

blt+ target 

2. Same as (1), but target address is in the LR and the branch should be predicted not 
to be taken. 

bltlr- 
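The rules for setting the y bit can be condensed into a small Python function. This is a sketch of the rules as stated in this section, not assembler source: the names `y_bit` and `apply_hint` are hypothetical, and the displacement is passed as a signed integer (or None for branches to the LR or CTR).

```python
def y_bit(suffix, displacement=None):
    """Return the y bit for a prediction suffix.

    suffix        '+', '-', or '' (no suffix: the assembler clears the bit)
    displacement  signed displacement for bc[l][a]; None for bclr[l]/bcctr[l]
    """
    if suffix not in ("+", "-", ""):
        raise ValueError(f"unknown suffix: {suffix!r}")
    if suffix == "":
        return 0
    # With y = 0, only a negative displacement is predicted taken.
    default_taken = displacement is not None and displacement < 0
    wants_taken = suffix == "+"
    # y is set exactly when the requested prediction reverses the default.
    return int(wants_taken != default_taken)


def apply_hint(bo, suffix, displacement=None):
    """Merge the y bit into the low-order bit of a BO value."""
    return bo | y_bit(suffix, displacement)


print(apply_hint(12, "+", 16))   # blt+ to a forward target: BO becomes 13
print(apply_hint(12, "-"))       # bltlr-: default already "not taken", BO stays 12
```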

F.6 Simplified Mnemonics for Condition Register 
Logical Instructions 

The condition register logical instructions, shown in Table F-15, can be used to set, clear, 
copy, or invert a given condition register bit. Simplified mnemonics are provided that allow 
these operations to be coded easily. 

NOTE: The symbols defined in Section F.1, “Symbols,” can be used to identify the 
condition register bit. 


Table F-15. Condition Register Logical Mnemonics 

Operation                 Simplified Mnemonic  Equivalent to
Condition register set    crset bx             creqv bx,bx,bx
Condition register clear  crclr bx             crxor bx,bx,bx
Condition register move   crmove bx,by         cror bx,by,by
Condition register not    crnot bx,by          crnor bx,by,by

Examples using the condition register logical mnemonics follow: 

1. Set CR bit 25. 

crset 25 (equivalent to creqv 25,25,25) 

2. Clear the SO bit of CR0. 

crclr so (equivalent to crxor 3,3,3) 

3. Same as (2), but SO bit to be cleared is in CR3. 

crclr 4 * cr3 + so (equivalent to crxor 15,15,15) 

4. Invert the EQ bit. 

crnot eq,eq (equivalent to crnor 2,2,2) 

5. Same as (4), but EQ bit to be inverted is in CR4, and the result is to be placed into 
the EQ bit of CR5. 

crnot 4 * cr5 + eq, 4 * cr4 + eq (equivalent to crnor 22,18,18) 
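
The bit numbers in the examples above follow from the expression 4 * crN + symbol. The Python sketch below reproduces that arithmetic. The helper name cr_bit is hypothetical, and the offsets for lt and gt (0 and 1) are assumed from the usual CR field layout; the examples above confirm only eq = 2 and so = 3.

```python
# Bit offsets within a 4-bit CR field (lt and gt are assumed; see lead-in).
CR_BIT_OFFSET = {"lt": 0, "gt": 1, "eq": 2, "so": 3}

def cr_bit(symbol, cr_field=0):
    """Return the CR bit number for a symbol in field cr0..cr7."""
    if not 0 <= cr_field <= 7:
        raise ValueError("cr_field must be in the range 0-7")
    return 4 * cr_field + CR_BIT_OFFSET[symbol]
```

With this model, cr_bit("so", 3) yields 15 and cr_bit("eq", 5) yields 22, matching examples 3 and 5.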

F.7 Simplified Mnemonics for Trap Instructions 

A standard set of codes, shown in Table F-16, has been adopted for the most common 
combinations of trap conditions. 


Table F-16. Standard Codes for Trap Instructions 


Code  Description                      TO Encoding  <   >   =   <U  >U
lt    Less than                        16           1   0   0   0   0
le    Less than or equal               20           1   0   1   0   0
eq    Equal                            4            0   0   1   0   0
ge    Greater than or equal            12           0   1   1   0   0
gt    Greater than                     8            0   1   0   0   0
nl    Not less than                    12           0   1   1   0   0
ne    Not equal                        24           1   1   0   0   0
ng    Not greater than                 20           1   0   1   0   0
llt   Logically less than              2            0   0   0   1   0
lle   Logically less than or equal     6            0   0   1   1   0
lge   Logically greater than or equal  5            0   0   1   0   1
lgt   Logically greater than           1            0   0   0   0   1
lnl   Logically not less than          5            0   0   1   0   1
lng   Logically not greater than       6            0   0   1   1   0
-     Unconditional                    31           1   1   1   1   1

Note: The symbol “<U” indicates that an unsigned less-than evaluation will be performed. The symbol “>U” 
indicates that an unsigned greater-than evaluation will be performed. 

The mnemonics defined in Table F-17 are variations of trap instructions, with the most 
useful values of TO represented in the mnemonic rather than specified as a numeric 
operand. 


Table F-17. Trap Mnemonics 


                                         32-Bit Comparison
Trap Semantics                           twi Immediate  tw Register
Trap unconditionally                     -              trap
Trap if less than                        twlti          twlt
Trap if less than or equal               twlei          twle
Trap if equal                            tweqi          tweq
Trap if greater than or equal            twgei          twge
Trap if greater than                     twgti          twgt
Trap if not less than                    twnli          twnl
Trap if not equal                        twnei          twne
Trap if not greater than                 twngi          twng
Trap if logically less than              twllti         twllt
Trap if logically less than or equal     twllei         twlle
Trap if logically greater than or equal  twlgei         twlge
Trap if logically greater than           twlgti         twlgt
Trap if logically not less than          twlnli         twlnl
Trap if logically not greater than       twlngi         twlng


Examples of the uses of trap mnemonics, shown in Table F-17, follow: 

1. Trap if register rA is not zero. 

twnei rA,0 (equivalent to twi 24,rA,0) 

2. Trap if register rA is not equal to rB. 

twne rA, rB (equivalent to tw 24,rA,rB) 

3. Trap if rA is logically greater than 0x7FF. 

twlgti rA,0x7FF (equivalent to twi 1,rA,0x7FF) 

4. Trap unconditionally. 

trap (equivalent to tw 31,0,0) 

Trap instructions evaluate a trap condition as follows: 

• The contents of register rA are compared with either the sign-extended SIMM field 
or the contents of register rB, depending on the trap instruction. 
• The comparison results in five conditions, which are ANDed with operand TO. If the result 
is not 0, the trap exception handler is invoked. See Table F-18 for these conditions. 

NOTE: Exceptions are referred to as interrupts in the architecture specification. 

Table F-18. TO Operand Bit Encoding 


TO Bit  ANDed with Condition
0       Less than, using signed comparison
1       Greater than, using signed comparison
2       Equal
3       Less than, using unsigned comparison
4       Greater than, using unsigned comparison
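
The trap evaluation described above can be modeled in a few lines. The Python sketch below is a simplified illustration for 32-bit operands, not a definition of processor behavior; the function names are hypothetical. TO bit 0 is treated as the most-significant bit of the 5-bit field, which is consistent with the TO encodings in Table F-16 (for example, lt = 16).

```python
MASK32 = 0xFFFFFFFF

def _signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def trap_conditions(a, b):
    """Return the five comparison conditions as a 5-bit value."""
    ua, ub = a & MASK32, b & MASK32
    sa, sb = _signed32(a), _signed32(b)
    cond = 0
    if sa < sb:
        cond |= 16  # TO bit 0: less than, signed
    if sa > sb:
        cond |= 8   # TO bit 1: greater than, signed
    if ua == ub:
        cond |= 4   # TO bit 2: equal
    if ua < ub:
        cond |= 2   # TO bit 3: less than, unsigned
    if ua > ub:
        cond |= 1   # TO bit 4: greater than, unsigned
    return cond

def trap_taken(to, a, b):
    """True if any condition selected by TO holds for operands a and b."""
    return (to & 0x1F & trap_conditions(a, b)) != 0
```

For instance, trap_taken(24, x, 0) models twnei rA,0 and is true for any nonzero x.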


F.8 Simplified Mnemonics for Special-Purpose 
Registers 

The mtspr and mfspr instructions specify a special-purpose register (SPR) as a numeric 
operand. Simplified mnemonics are provided that represent the SPR in the mnemonic rather 
than requiring it to be coded as a numeric operand. Table F-19 provides a list of the 
simplified mnemonics that should be provided by assemblers for SPR operations. 


Table F-19. Simplified Mnemonics for SPRs 


                             Move to SPR                                    Move from SPR
Special-Purpose Register     Simplified Mnemonic  Equivalent to             Simplified Mnemonic  Equivalent to
XER                          mtxer rS             mtspr 1,rS                mfxer rD             mfspr rD,1
Link register                mtlr rS              mtspr 8,rS                mflr rD              mfspr rD,8
Count register               mtctr rS             mtspr 9,rS                mfctr rD             mfspr rD,9
DSISR                        mtdsisr rS           mtspr 18,rS               mfdsisr rD           mfspr rD,18
Data address register        mtdar rS             mtspr 19,rS               mfdar rD             mfspr rD,19
Decrementer                  mtdec rS             mtspr 22,rS               mfdec rD             mfspr rD,22
SDR1                         mtsdr1 rS            mtspr 25,rS               mfsdr1 rD            mfspr rD,25
Save and restore register 0  mtsrr0 rS            mtspr 26,rS               mfsrr0 rD            mfspr rD,26
Save and restore register 1  mtsrr1 rS            mtspr 27,rS               mfsrr1 rD            mfspr rD,27
SPRG0-SPRG3                  mtsprg n,rS          mtspr 272 + n,rS          mfsprg rD,n          mfspr rD,272 + n
External access register     mtear rS             mtspr 282,rS              mfear rD             mfspr rD,282
Time base lower              mttbl rS             mtspr 284,rS              mftb rD              mftb rD,268
Time base upper              mttbu rS             mtspr 285,rS              mftbu rD             mftb rD,269
Processor version register   -                    -                         mfpvr rD             mfspr rD,287
IBAT register, upper         mtibatu n,rS         mtspr 528 + (2 * n),rS    mfibatu rD,n         mfspr rD,528 + (2 * n)
IBAT register, lower         mtibatl n,rS         mtspr 529 + (2 * n),rS    mfibatl rD,n         mfspr rD,529 + (2 * n)
DBAT register, upper         mtdbatu n,rS         mtspr 536 + (2 * n),rS    mfdbatu rD,n         mfspr rD,536 + (2 * n)
DBAT register, lower         mtdbatl n,rS         mtspr 537 + (2 * n),rS    mfdbatl rD,n         mfspr rD,537 + (2 * n)


Following are examples using the SPR simplified mnemonics found in Table F-19: 

1. Copy the contents of rS to the XER. 

mtxer rS (equivalent to mtspr 1,rS) 

2. Copy the contents of the LR to rS. 

mflr rS (equivalent to mfspr rS,8) 

3. Copy the contents of rS to the CTR. 

mtctr rS (equivalent to mtspr 9,rS) 
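
For the indexed mnemonics in Table F-19 (SPRG and BAT registers), the SPR number is computed from the index n. The Python sketch below shows that arithmetic. The function names are hypothetical, and the assumed index range of 0 to 3 reflects SPRG0-SPRG3 and four BAT pairs of each type.

```python
def _check_index(n):
    # Assumption: four registers of each kind, indexed 0-3.
    if not 0 <= n <= 3:
        raise ValueError("index must be in the range 0-3")

def sprg_spr(n):
    """SPR number used by mtsprg n,rS and mfsprg rD,n."""
    _check_index(n)
    return 272 + n

def bat_spr(kind, half, n):
    """SPR number for a BAT register.

    kind is 'i' (IBAT) or 'd' (DBAT); half is 'u' (upper) or 'l' (lower).
    """
    _check_index(n)
    base = {"i": 528, "d": 536}[kind]
    return base + {"u": 0, "l": 1}[half] + 2 * n
```

For example, bat_spr("d", "u", 0) gives 536, the operand of mtspr for mtdbatu 0,rS.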

F.9 Recommended Simplified Mnemonics 

This section describes some of the most commonly-used operations (such as no-op, load 
immediate, load address, move register, and complement register). 

F.9.1 No-Op (nop) 

Many PowerPC instructions can be coded in a way that, effectively, no operation is 
performed. An additional mnemonic is provided for the preferred form of no-op. If an 
implementation performs any type of run-time optimization related to no-ops, the preferred 
form, shown here, is the no-op that triggers it: 

nop (equivalent to ori 0,0,0) 

F.9.2 Load Immediate (li) 

The addi and addis instructions can be used to load an immediate value into a register. 
Additional mnemonics are provided to convey the idea that no addition is being performed 
but that data is being moved from the immediate operand of the instruction to a register. 

1. Load a 16-bit signed immediate value into rD. 

li rD,value (equivalent to addi rD,0,value) 

2. Load a 16-bit signed immediate value, shifted left by 16 bits, into rD. 

lis rD,value (equivalent to addis rD,0,value) 
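
The effect of li and lis on a 32-bit register can be sketched as follows. This Python model is illustrative only: the function names mirror the mnemonics, and it assumes the immediate is a 16-bit field that addi sign-extends and addis shifts left by 16 bits, as described above.

```python
MASK32 = 0xFFFFFFFF

def li(value):
    """Model li rD,value: sign-extend a 16-bit immediate to 32 bits."""
    value &= 0xFFFF
    if value & 0x8000:
        value -= 0x10000
    return value & MASK32

def lis(value):
    """Model lis rD,value: place a 16-bit immediate in the upper halfword."""
    return ((value & 0xFFFF) << 16) & MASK32
```

Here li(-1) produces 0xFFFFFFFF and lis(0x1234) produces 0x12340000.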

F.9.3 Load Address (la) 

This mnemonic permits computing the value of a base-displacement operand, using the 
addi instruction which normally requires a separate register and immediate operands. 

la rD,d(rA) (equivalent to addi rD,rA,d) 

The la mnemonic is useful for obtaining the address of a variable specified by name, 
allowing the assembler to supply the base register number and compute the displacement. 
If the variable v is located at offset dv bytes from the address in register rv, and the 
assembler has been told to use register rv as a base for references to the data structure 
containing v, the following line causes the address of v to be loaded into register rD: 

la rD,v (equivalent to addi rD,rv,dv) 

F.9.4 Move Register (mr) 

Several PowerPC instructions can be coded to copy the contents of one register to another. 
A simplified mnemonic is provided that signifies that no computation is being performed, 
but merely that data is being moved from one register to another. 

The following instruction copies the contents of rS into rA. This mnemonic can be coded 
with a dot (.) suffix to cause the Rc bit to be set in the underlying instruction. 

mr rA,rS (equivalent to or rA,rS,rS) 

F.9.5 Complement Register (not) 

Several PowerPC instructions can be coded in a way that they complement the contents of 
one register and place the result into another register. A simplified mnemonic is provided 
that allows this operation to be coded easily. 

The following instruction complements the contents of rS and places the result into rA. 
This mnemonic can be coded with a dot (.) suffix to cause the Rc bit to be set in the 
underlying instruction. 

not rA,rS (equivalent to nor rA,rS,rS) 

F.9.6 Move to Condition Register (mtcr) 

This mnemonic permits copying the contents of a GPR to the condition register, using the 
same syntax as the mfcr instruction. 

mtcr rS (equivalent to mtcrf 0xFF,rS) 

Glossary of Terms and Abbreviations 

The glossary contains an alphabetical list of terms, phrases, and abbreviations used in this 
book. Some of the terms and definitions included in the glossary are reprinted from IEEE 
Std. 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, copyright ©1985 by 
the Institute of Electrical and Electronics Engineers, Inc. with the permission of the IEEE. 

NOTE: Some terms are defined in the context of how they are used in this book. 



Architecture. A detailed specification of requirements for a processor or 
computer system. It does not specify details of how the processor or 
computer system must be implemented; instead it provides a 
template for a family of compatible implementations. 

Asynchronous exception. An exception caused by events external to 
the processor’s execution. An asynchronous exception is not 
associated with any of the instructions currently in execution. In this 
document, the term ‘asynchronous exception’ is used 
interchangeably with the word interrupt. 

Atomic access. A bus access that attempts to be part of a read-write operation 
to the same address uninterrupted by any other access to that address 
(the term refers to the fact that the transactions are indivisible). The 
PowerPC architecture implements atomic accesses through the 
lwarx/stwcx. instruction pair. 



BAT (block address translation) mechanism. A software-controlled array 
that stores the available block address translations on-chip. 


Biased exponent. An exponent whose range of values is shifted by a constant 
(bias). Typically a bias is provided to allow a range of positive values 
to express a range that includes both positive and negative values. 



Big-endian. A byte-ordering method in memory where the address n of a 
word corresponds to the most-significant byte. In an addressed 
memory word, the bytes are ordered (left to right) 0, 1, 2, 3, with 0 
being the most-significant byte. See Little-endian. 

Block. An area of memory that ranges from 128 Kbyte to 256 Mbyte, whose 
size, address translation, and protection attributes are controlled by 
the BAT mechanism. 

Boundedly undefined. A characteristic of results of certain operations that 
are not rigidly prescribed by the PowerPC architecture. Boundedly- 
undefined results for a given operation may vary among 
implementations, and between execution attempts in the same 
implementation. 

If a sequence of one or more instructions is executed in a manner not 
prescribed by the architecture, or in a mode, method, or context not 
specified by the architecture, the resulting error conditions may not 
be known or defined, but they are finite. Therefore, the term 
boundedly undefined is used to define the unknown state of the 
machine. 




Cache. High-speed memory component containing recently-accessed data 
and/or instructions (subset of main memory). 

Cache block. A small region of contiguous memory that is transferred 
between cache and memory. The size of a cache block may vary 
among processors; the maximum block size is one page. In PowerPC 
processors, cache coherency is maintained on a cache-block basis. 
Note: The term ‘cache block’ is often used interchangeably with 
‘cache line’. 

Cache coherency. An attribute wherein an accurate and common view of 
memory is provided to all devices that share the same memory 
system. Caches are coherent if a processor performing a read from 
its cache is supplied with data corresponding to the most recent value 
written to memory or to another processor’s cache. 

Cache flush. An operation that removes from a cache a block(s) of data from 
a specified address range. This operation ensures that any modified 
data within the specified address range is written back to main 
memory. This operation is generated typically by a Data Cache 
Block Flush (dcbf) instruction. 

Caching-inhibited. A memory update policy in which the cache is bypassed 
and the load or store is performed from or to main memory. 

Cast-outs. Cache blocks that must be removed from the cache when a cache 
miss causes a cache block to be replaced. The block being replaced 
in the cache is written to memory if it has been modified (see MESI). 

Changed bit. One of two page history bits found in each page table entry 
(PTE). The processor sets the changed bit if any store is performed 
into the page. See also Page access history bits and Referenced bit. 

Clear. To cause a bit or bit field to record a value of zero. See also Set. 

Context synchronization. An operation that ensures that all instructions in 
execution complete past the point where they can produce an 
exception, that all instructions in execution complete in the context 
in which they began execution, and that all subsequent instructions 
are fetched and executed in the same or new context. Context 
synchronization may result from executing specific instructions 
(such as isync or rfi) or when certain events occur (such as an 
exception). 

Copy-back. An operation in which modified data in a cache block is copied 
back to memory. A mode in which store instructions place data into 
the cache and rely upon cast-out, cache-flush, or cache-block-store 
instructions to move the modified data to memory. 


Denormalized number. A nonzero floating-point number whose exponent 
has a zero value and whose implicit bit is zero (see also Tiny 
number). 

Direct-mapped cache. A cache in which each main memory address can 
appear in only one location within the cache. Such a cache operates more quickly 
when the memory request is a cache hit. 

Direct-store. Interface available on PowerPC processors only to support 
direct-store devices from the POWER architecture. When the T bit 
of a segment descriptor is set, the descriptor defines the region of 
memory that is to be used as a direct-store segment. 

Note: This facility is being phased out of the architecture and will 
not likely be supported in future devices. Therefore, software should 
not depend on it and new software should not use it. 



Effective address (EA). The 32-bit address specified for a load, store, or an 
instruction fetch. This address is then submitted to the MMU for 
translation to either a physical memory address or an I/O address. 

Exception. A condition encountered by the processor that requires special, 
supervisor-level processing. Exceptions are also known as interrupts. 

Exception handler. A software routine that executes when an exception is 
taken. Normally, the exception handler reacts to the condition that 
caused the exception, or performs some other meaningful task (that 
may include aborting the program that caused the exception). The 
address for each exception handler is identified by an exception 
vector offset defined by the architecture and a prefix selected via the 
MSR. 

Extended opcode. A secondary opcode field generally located in instruction 
bits 21-30, that further defines the instruction. All PowerPC 
instructions are one word in length. The most significant 6 bits of the 
instruction are the primary opcode, identifying the instruction. 
However, many PowerPC instructions have the same primary opcode 
and rely on the extended opcode to uniquely identify the instruction. 
See also Primary opcode. 
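
As an illustration of these fields, the Python sketch below extracts the primary opcode (bits 0-5), the extended opcode (bits 21-30), and bit 31 from a 32-bit instruction word, using the big-endian bit numbering of this book (bit 0 is the most-significant bit). The function name is hypothetical, and the extended opcode value is meaningful only for instruction forms that actually place one in bits 21-30.

```python
def decode_opcode_fields(word):
    """Split a 32-bit instruction word into (primary, extended, bit31)."""
    word &= 0xFFFFFFFF
    primary = word >> 26             # bits 0-5
    extended = (word >> 1) & 0x3FF   # bits 21-30
    bit31 = word & 1                 # bit 31 (the Rc bit where defined)
    return primary, extended, bit31
```

For example, the word 0x7C832378 (the encoding of or r3,r4,r4) decodes to primary opcode 31, extended opcode 444, and bit 31 clear.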

Execution synchronization. A mechanism by which all instructions in 
execution are architecturally complete before beginning execution 
(appearing to begin execution) of the next instruction. Similar to 
context synchronization but doesn't force the contents of the 
instruction buffers to be deleted and refetched. 

Exponent. In the binary representation of a floating-point number, the 
exponent is the component that normally specifies the position of the 
binary point of the represented number. See also Biased exponent. 

Fetch. Retrieving instructions or data from either the cache or main memory 
and placing them into the instruction queue or GPR, respectively. 

Floating-point register (FPR). Any of the 32 registers in the floating-point 
register file. These registers provide the source operands and 
destination results for floating-point instructions. Load instructions 
move data from memory to FPRs and store instructions move data 
from FPRs to memory. The FPRs are 64 bits wide and record 
floating-point values in double-precision format. 

Fraction. In the binary representation of a floating-point number, the field of 
the significand that lies to the right of its implied binary point. 

Fully-associative. Addressing scheme where every storage location (every 
byte) can have any possible address. 


General-purpose register (GPR). Any of the 32 registers in the 
general-purpose register file. These registers provide the source operands and 
destination results for all integer data manipulation instructions. 
Also, address operands for all instructions that require an address are 
found in GPRs. Integer load instructions move data from memory to 
GPRs and integer store instructions move data from GPRs to 
memory. 

Guarded. The guarded attribute pertains to out-of-order execution. When a 
page is designated as guarded, instructions and data cannot be 
accessed out-of-order. 



Harvard architecture. An architectural model featuring separate caches for 
instruction and data. 

Hashing. An algorithm that generates an address used to help search 
for an item more quickly in a memory structure. In the PowerPC architecture, 
hashing is used to locate a PTE in the page table. 



IEEE 754. A standard written by the Institute of Electrical and Electronics 
Engineers that defines operations and representations of binary 
floating-point arithmetic. 

Illegal instructions. Any instruction using an undefined operation code in 
the PowerPC architecture. 

Implementation. A particular processor that conforms to the PowerPC 
architecture, but may differ from other architecture-compliant 
implementations, for example, in design, feature set, and 
implementation of optional features. The PowerPC architecture has 
many different implementations. 

Implementation-dependent. An aspect of a feature in a processor’s design 
that is defined by a processor’s design specifications rather than by 
the PowerPC architecture. 


Implementation-specific. An aspect of a feature in a processor’s design that 
is not required by the PowerPC architecture, but for which the 
PowerPC architecture may provide concessions to ensure that 
processors that implement the feature do so consistently. 


Imprecise exception. A type of synchronous exception that is allowed not to 
adhere to the precise exception model (see Precise exception). The 
PowerPC architecture allows only floating-point exceptions to be 
handled imprecisely. 

Inexact. Loss of accuracy in an arithmetic operation when the rounded result 
differs from the infinitely precise value with unbounded range. 


In-order. An aspect of an operation that adheres to a sequential model. An 
operation is said to be performed in-order if, at the time that it is 
performed, it is known to be required by the sequential execution 
model. See Out-of-order. 

Instruction latency. The number of clock cycles between the execution of an 
instruction and when the results of that instruction are available to 
the next sequential instruction. 

Instruction parallelism. A feature of PowerPC processors that allows 
instructions to be processed in parallel. 

Interrupt. An asynchronous exception. On PowerPC processors, interrupts 
are a special case of exceptions. See also asynchronous exception. 

Invalid state. State of a cache entry that does not currently contain a valid 
copy of a cache block from memory. 


Key bits. A set of key bits, referred to as Ks and Kp, in each segment register 
and each BAT register. The key bits determine whether supervisor or 
user programs can access a page within that segment or block. 

Kill. An operation that causes a cache block to be invalidated. 


L2 cache. A cache between the L1 cache and main memory. See Secondary 
cache. 

Least-significant bit (lsb). The bit of least value in an address, register, data 
element, or instruction encoding. A bit to the farthest right in a bit 
field. 

Least-significant byte (LSB). The byte of least value in an address, register, 
data element, or instruction encoding. A byte to the farthest right in 
a byte field. 

Little-endian. A byte-ordering method in memory where the address n of a 
word corresponds to the least-significant byte. In an addressed 
memory word, the bytes are ordered (left to right) 3, 2, 1, 0, with 3 
being the most-significant byte. See Big-endian. 
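
The two byte orderings can be compared directly. The Python sketch below lays out a 32-bit word in memory under each ordering; the helper name store_word is hypothetical, and the list index stands for the byte offset from address n.

```python
def store_word(value, byteorder):
    """Return the four bytes of a 32-bit word as stored from address n upward.

    byteorder is 'big' or 'little'.
    """
    return list((value & 0xFFFFFFFF).to_bytes(4, byteorder))
```

For the word 0x0A0B0C0D, big-endian ordering places the most-significant byte 0x0A at address n, whereas little-endian ordering places the least-significant byte 0x0D there.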


MESI (modified/exclusive/shared/invalid). Cache coherency protocol used 
to manage caches on different devices that share a memory system. 
Note: The PowerPC architecture does not specify the 
implementation of a MESI protocol to ensure cache coherency. 

Memory access ordering. The specific order in which the processor 
performs load and store memory accesses and the order in which 
those accesses complete. 

Memory-mapped accesses. Accesses whose addresses use the page or block 
address translation mechanisms provided by the MMU and that 
occur externally with the bus protocol defined for memory. 

Memory coherency. An aspect of caching in which it is ensured that an 
accurate view of memory is provided to all devices that share system 
memory. 

Memory consistency. Refers to agreement of levels of memory with respect 
to a single processor and system memory (for example, on-chip 
cache, secondary cache, and system memory) and between multiple 
processors and input/output devices. Regardless of where a data item 
is stored it is visible to all processors and devices. See coherency. 

Memory management unit (MMU). The functional unit that is capable of 
translating an effective (logical) address to a physical address, 
providing protection mechanisms, and defining caching methods. 

Microarchitecture. The hardware implementation details of a 
microprocessor’s design. Such details are not defined by the 
PowerPC architecture. 

Mnemonic. The abbreviated name of an instruction. 

Modified state. When a cache block is in the modified state, it has been 
modified by the processor since it was copied from memory. See 
MESI. 

Munging. A modification performed on the three low-order bits of an 
effective address that allows it to appear to the processor that 
individual aligned scalars are stored as little-endian values, when in 
fact they are stored in big-endian order, but at different byte addresses 
within double words. 

Note: Munging affects only the effective address and not the byte 
order. This term is not used in the PowerPC architecture 
document. 
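
A minimal sketch of the address modification follows. It assumes the commonly documented scheme in which the three low-order address bits are exclusive-ORed with a value that depends on the size of the aligned access (0b111 for bytes, 0b110 for halfwords, 0b100 for words, and no change for double words). These XOR values and the function name munge are assumptions of this sketch and are not stated in this glossary.

```python
# XOR value applied to the three low-order EA bits, keyed by access size in bytes.
# Assumed values; see the lead-in.
MUNGE_XOR = {1: 0b111, 2: 0b110, 4: 0b100, 8: 0b000}

def munge(ea, size):
    """Return the modified effective address for an aligned access."""
    if size not in MUNGE_XOR:
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    if ea % size != 0:
        raise ValueError("access must be aligned to its size")
    return (ea ^ MUNGE_XOR[size]) & 0xFFFFFFFF
```

Only the low three bits change, so the access always stays within the same double word.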

Multiprocessing. The capability of software, especially operating systems, 
to support execution on more than one processor at the same time. 

Most-significant bit (msb). The highest-order bit in an address, register, 
data element, or instruction encoding. The bit to the farthest left in a 
bit field. 

Most-significant byte (MSB). The highest-order byte in an address, 
register, data element, or instruction encoding. The byte to the 
farthest left in a byte field. 


NaN. An abbreviation for ‘Not a Number’; a symbolic entity encoded in 
floating-point format. There are two types of NaNs: signaling NaNs 
(SNaNs) and quiet NaNs (QNaNs). 

No-op. No-operation. An operation that does not change anything in registers 
or generate any bus activity. 

Normalization. A process by which a floating-point value is manipulated 
such that it can be represented in the format for the appropriate 
precision (single- or double-precision). For a floating-point value to 
be representable in the single- or double-precision format, the 
leading implied bit must be a 1 and the exponent must be greater than 
zero. 

OEA (operating environment architecture). The level of the architecture 
that describes the PowerPC memory management model, supervisor-level 
registers, synchronization requirements, and the exception 
model. It also defines the time-base feature from a supervisor-level 
perspective. Implementations that conform to the PowerPC OEA 
also conform to the PowerPC UISA and VEA. 

Optional. A feature, such as an instruction, a register, or an exception, that is 
defined by the PowerPC architecture but not required to be 
implemented. 

Out-of-order. An aspect of an operation that allows it to be performed ahead 
of one that may have preceded it in the sequential model, for 
example, speculative operations. An operation is said to be 
performed out-of-order if, at the time that it is performed, it is not 
known to be required by the sequential execution model. See 
In-order. 

Out-of-order execution. A technique that allows instructions to be issued 
and completed in an order that differs from their sequence in the 
instruction stream. 

Overflow. An error condition that occurs during arithmetic operations when 
the result cannot be stored accurately in the destination register(s). 
For example, if two 32-bit numbers are multiplied, the result may not 
be representable in 32 bits. In an integer add operation, if the carry 
into the sign bit is not equal to the carry out of the sign bit, the 
overflow bit is set. 
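
The carry rule for integer addition can be checked with a short model. The Python sketch below adds two 32-bit values and reports overflow when the carry into the sign bit differs from the carry out of it. The function name add32_overflow is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def add32_overflow(a, b):
    """Return (32-bit sum, overflow flag) for a + b."""
    a &= MASK32
    b &= MASK32
    total = a + b
    # Carry out of the sign bit (bit position 31 of the sum).
    carry_out = (total >> 32) & 1
    # Carry into the sign bit, found by adding only the low 31 bits.
    carry_in = (((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31) & 1
    return total & MASK32, carry_in != carry_out
```

Adding 0x7FFFFFFF and 1 overflows (the signed result wraps to a negative value), while adding 0xFFFFFFFF (that is, -1) and 1 does not.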



Page. A region in memory. The OEA defines a page as a 4-Kbyte area of 
memory, aligned on a 4-Kbyte boundary. 

Page access history bits. The changed and referenced bits in the PTE keep 
track of the access history within the page. The referenced bit is set 
by the MMU whenever the page is accessed for a read operation. The 
changed bit is set when the page is stored into. See Changed bit and 
Referenced bit. 

Page fault. A page fault is a condition that occurs when the processor 
attempts to access a virtual address that does not reside within a page 
currently resident in physical memory. On PowerPC processors, a 
page fault exception condition occurs when a matching, valid page 
table entry (PTE[V] = 1) cannot be located in the page table. 

Page table. A table in memory that is composed of page table entries, or PTEs. 
It is further organized into eight PTEs per PTEG (page table entry 
group). The number of PTEGs in the page table depends on the size 
of the page table (as specified in the SDR1 register). 

Page table entry (PTE). Data structures containing information used to 
translate virtual address to physical address on a 4-Kbyte page basis. 
A PTE consists of 8 bytes of information. 

Physical memory. The actual memory that can be accessed through the 
system’s memory bus. 

Pipelining. A technique that breaks operations, such as instruction 
processing or bus transactions, into smaller distinct stages or tenures 
(respectively) so that a subsequent operation can begin before the 
previous one has completed. 

Precise exceptions. A category of exception for which the instruction 
causing the exception can be precisely located. See Imprecise 
exceptions. 

Primary opcode. The most-significant 6 bits (bits 0-5) of the instruction 
encoding that identifies the instruction or instruction type. See 
Secondary opcode. 

Protection boundary. A boundary between protection domains. 

Protection domain. A protection domain is a segment, a virtual page, a BAT 
area, or a range of unmapped effective addresses. It is defined only 
when the appropriate relocate bit in the MSR (IR or DR) is 1. 



Quad word. A group of 16 contiguous locations starting at an address 
divisible by 16. 

Quiet NaN. A type of NaN that can propagate through most arithmetic 
operations without signaling exceptions. A quiet NaN is used to 
represent the results of certain invalid operations, such as zero 
divided by zero or invalid arithmetic operations on infinities or on 
NaNs. See Signaling NaN. 

rA. The rA instruction field is used to specify a GPR to be used as a source 
or destination register. Generally, if the instruction requires an 
address as one of the input operands this register is used. 

rB. The rB instruction field is used to specify a GPR to be used as a source 
register. 

rD. The rD instruction field is used to specify a GPR to be used as a 
destination register. 

rS. The rS instruction field is used to specify a GPR to be used as a source 
register. 

Real address mode. An MMU mode when no address translation is 
performed and the effective address specified is the same as the 
physical address. The processor’s MMU is operating in real address 
mode if its ability to perform address translation has been disabled 
through the IR and/or DR bits of the MSR. 

Record bit. Bit 31 (or the Rc bit) in the instruction encoding. When it is set, 
the instruction updates the condition register (CR) to reflect the result of the 
operation. 

Referenced bit. One of two page history bits found in each page table entry 
(PTE). The processor sets the referenced bit whenever the page is 
accessed for a read. See also Page access history bits. 

Register indirect addressing. A form of addressing that specifies one GPR 
that contains the address for the load or store. 

Register indirect with immediate index addressing. A form of addressing 
that specifies an immediate value to be added to the contents of a 
specified GPR to form the target address for the load or store. 

Register indirect with index addressing. A form of addressing that specifies 
that the contents of two GPRs be added together to yield the target 
address for the load or store. 

Reservation. The processor establishes a reservation on a cache block of 
memory space when it executes a lwarx instruction to read a 
memory semaphore into a GPR when an atomic update of memory 
is necessary. 

Reserved field. In an instruction or register, a reserved field is one that is not 
assigned a function. A reserved field may be a single bit. The 
handling of reserved bits is implementation-dependent. In registers, 
software is permitted to write any value to such a bit. A subsequent 
reading of the bit returns 0 if the value last written to the bit was 0 
and returns an undefined value (0 or 1) otherwise. 

RISC (reduced instruction set computing). An architecture characterized 
by fixed-length instructions with nonoverlapping functionality and 
by a separate set of load and store instructions that perform memory 
accesses. 



Scalability. The capability of an architecture to generate implementations 
specific for a wide range of purposes, and in particular 
implementations of significantly greater performance and/or 
functionality than at present, while maintaining compatibility with 
current implementations. 

Secondary cache. A cache memory that is typically larger and has a longer 
access time than the primary cache. A secondary cache may be 
shared by multiple devices. Also referred to as L2, or level-2, cache. 

Segment. A 256-Mbyte area of virtual memory that is the most basic memory 
space defined by the PowerPC architecture. Each segment is 
configured through a unique segment descriptor. 


Segment descriptors. Information used to generate the high-order bits of the 
virtual address plus three additional control bits. The segment 
descriptors reside in 16 on-chip segment registers. 
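
The relationship between the 256-Mbyte segment size and the 16 segment registers follows from simple arithmetic: a 32-bit effective address space divided into 2^28-byte segments gives 16 segments, so the four high-order address bits can select a segment register. The Python sketch below illustrates this split; the function name is hypothetical.

```python
SEGMENT_BITS = 28                      # 256 Mbytes = 2**28 bytes
SEGMENT_COUNT = (1 << 32) >> SEGMENT_BITS


def split_effective_address(ea):
    """Return (segment register number, offset within the segment)."""
    ea &= 0xFFFFFFFF
    segment = ea >> SEGMENT_BITS               # four high-order bits
    offset = ea & ((1 << SEGMENT_BITS) - 1)    # remaining 28 bits
    return segment, offset
```

Under this model, effective address 0xC0001234 falls in segment 12 at offset 0x1234.
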

Set (v). To write a nonzero value to a bit or bit field; the opposite of clear. The
term ‘set’ may also be used to generally describe the updating of a 
bit or bit field. 


Set (n). A subdivision of a cache. Cacheable data can be stored in a given
location in any one of the sets, typically corresponding to its
lower-order address bits. Because several memory locations can map to the
same location, cached data is typically placed in the set whose cache
block corresponding to that address was used least recently. See
Set-associative.

Set-associative. Aspect of cache organization in which the cache space is 
divided into sections, called sets. The cache controller associates a 
particular main memory address with the contents of a particular set, 
or region, within the cache. 
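
The placement rule described in the Set and Set-associative entries can be sketched as a toy Python model. It follows the glossary's terminology, in which a block address selects a location from its lower-order bits and the block may reside in any one of the sets at that location, with the least recently used entry replaced. The class name, parameters, and default sizes are assumptions chosen for illustration and do not describe any particular processor.

```python
class ToySetAssociativeCache:
    """Toy model: each location holds at most num_sets blocks, replaced LRU."""

    def __init__(self, num_sets=2, num_locations=4, block_size=32):
        self.num_sets = num_sets
        self.num_locations = num_locations
        self.block_size = block_size
        # One list per location, ordered from least to most recently used.
        self.locations = [[] for _ in range(num_locations)]

    def access(self, address):
        """Touch an address; return True on a hit and False on a miss."""
        block = address // self.block_size
        entries = self.locations[block % self.num_locations]  # low-order bits
        hit = block in entries
        if hit:
            entries.remove(block)
        elif len(entries) == self.num_sets:
            entries.pop(0)            # replace the least recently used block
        entries.append(block)         # mark as most recently used
        return hit
```

With the default sizes, addresses 0, 128, and 256 all map to the same location, so touching all three forces a replacement of whichever was used least recently.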

Signaling NaN. A type of NaN that generates an invalid operation program
exception when it is specified as an arithmetic operand. See Quiet
NaN.

Significand. The component of a binary floating-point number that consists
of an explicit or implicit leading bit to the left of its implied binary 
point and a fraction field to the right. 

Simplified mnemonics. Assembler mnemonics that represent a more 
complex form of a common operation. 

Static branch prediction. Mechanism by which software (for example, 
compilers) can give a hint to the machine hardware about the 
direction a branch is likely to take. 

Sticky bit. A bit that when set must be cleared explicitly. 

Strong ordering. A memory access model that requires exclusive access to 
an address before making an update, to prevent another device from 
using stale data. 

Superscalar machine. A machine that can process multiple instructions
concurrently from a conventional linear instruction stream. 

Supervisor mode. The privileged operation state of a processor. In 
supervisor mode, software, typically the operating system, can 
access all control registers and can access the supervisor memory 
space, among other privileged operations. 

Synchronization. A process used to ensure that operations occur strictly in 
order. See Context synchronization and Execution synchronization. 

Synchronous exception. An exception that is generated by the execution of
a particular instruction or instruction sequence. There are two
meanings of this concept.

Synchronous meaning “at the same time as other exceptions”.
Exceptions that occur at the same time are processed in a specific
order. For example, if a machine check, an invalid instruction, and a
decrementer exception occur at the same time, the machine check
has priority over the invalid instruction, and the invalid instruction
has priority over the decrementer exception.

Synchronous meaning “at the same time as the instruction in
execution causing the exception”. Exceptions that occur as the result
of an instruction execution are called synchronous exceptions. There
are many examples: the execution of an invalid instruction, the sc
and trap instructions, alignment, a privileged instruction in user (or
problem) mode, etc. These are also called precise exceptions.

System memory. The physical memory available to a processor. 



TLB (translation lookaside buffer). A cache that holds recently used page
table entries.


Throughput. A measure of the number of instructions that are processed per 
unit of time. 

Tiny. A floating-point value that is too small to be represented as a
normalized value. A floating-point number not equal to zero where
the exponent is zero and the mantissa is nonzero.
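
The bit-pattern test in the second sentence of the Tiny entry can be sketched in Python for the single-precision format. The sketch assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the function names are hypothetical.

```python
import struct


def single_bits(value):
    """Return the 32-bit single-precision encoding of a Python float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def is_denormalized_single(bits):
    """True if the exponent field is zero and the fraction field is nonzero."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return exponent == 0 and fraction != 0
```

A value such as 1e-40 is below the smallest normalized single-precision magnitude, so its encoding has a zero exponent field and a nonzero fraction, whereas zero itself and 1.0 do not satisfy the test.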



UISA (user instruction set architecture). The level of the architecture to 
which user-level software should conform. The UISA defines the 
base user-level instruction set, user-level registers, data types, 
floating-point memory conventions and exception model as seen by 
user programs, and the memory and programming models. 


Underflow. An error condition that occurs during arithmetic operations when 
the result cannot be represented accurately in the destination register. 
For example, underflow can happen if two floating-point fractions 
are multiplied and the result requires a smaller exponent and/or 
mantissa than the single-precision format can provide. In other 
words, the result is too small to be represented accurately. 

Unified cache. Combined data and instruction cache. 




User mode. The unprivileged operating state of a processor used typically by 
application software. In user mode, software can only access certain 
control registers and can access only user memory space. No 
privileged operations can be performed. Also referred to as problem 
state. 

Virtual address. An intermediate address used in the translation of an
effective address to a physical address.

Virtual memory. The address space created using the memory management
facilities of the processor. Program access to virtual memory is
possible only when its page is resident in physical memory.


VEA (virtual environment architecture). The level of the architecture that 
describes the memory model for an environment in which multiple 
devices can access memory, defines aspects of the cache model, 
defines cache control instructions, and defines the time-base facility 
from a user-level perspective. Implementations that conform to the 
PowerPC VEA also adhere to the UISA, but may not necessarily 
adhere to the OEA. 


W

Weak ordering. A memory access model that allows bus operations to be
reordered dynamically, which improves overall performance and in
particular reduces the effect of memory latency on instruction
throughput.

Word. A 32-bit data element. 

Write-back. A cache memory update policy in which processor write cycles 
are directly written only to the cache. External memory is updated 
only indirectly, for example, when a modified cache block is cast out 
to make room for newer data. 

Write-through. A cache memory update policy in which all processor write 
cycles are written to both the cache and memory. 
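
The difference between the write-back and write-through policies defined above can be sketched with a toy Python model. It is an illustration only: the class name and methods are hypothetical, memory and cache are modeled as dictionaries keyed by address, and cast-out is triggered explicitly rather than by a replacement algorithm.

```python
class ToyWritePolicyCache:
    """Toy cache that applies either a write-through or a write-back policy."""

    def __init__(self, write_through):
        self.write_through = write_through
        self.cache = {}        # address -> cached value
        self.memory = {}       # address -> value in external memory
        self.modified = set()  # addresses modified in the cache only

    def write(self, address, value):
        self.cache[address] = value
        if self.write_through:
            self.memory[address] = value   # memory is updated on every write
        else:
            self.modified.add(address)     # memory is updated later, on cast-out

    def cast_out(self, address):
        """Remove a block from the cache, writing it to memory if modified."""
        value = self.cache.pop(address)
        if address in self.modified:
            self.memory[address] = value
            self.modified.discard(address)
```

In the write-through case memory holds the new value immediately after the write; in the write-back case memory is updated only when the modified block is cast out.
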

operand conventions

architecture levels represented, 3-1 
biased exponent values, 3-19 
significand value, 3-17
tiny, definition, 3-18 
underflow/overflow, 3-16 
terminology, xxxv 
CR (condition register) 
bit fields, 2-5 

CR bit and identification symbols, F-1
CR logical instructions, 4-51, A-21 
CR settings, 4-25, B-2 
CR0/CR1 field definitions, 2-6-2-6 
CRn field, compare instructions, 2-7
move to/from CR instructions, 4-53 
simplified mnemonics, F-18 
CR logical instructions, 4-51, A-21, F-18 
crand, 4-51, 8-34 
crandc, 4-51, 8-35 
creqv, 4-51, 8-36 
crnand, 4-51, 8-37 
crnor, 4-51, 8-38
cror, 4-51, 8-39
crorc, 4-51, 8-40, 8-41
crxor, 4-51
CTR (count register) 

BO operand encodings, 2-13 

branch conditional to count register, 4-46, B-4 


D 

DABR (data address breakpoint register), 2-34, 6-24 
DAR (data address register) 
alignment exception register settings, 6-30 
description, 2-29 

DSI exception register settings, 6-26 
Data cache 
clearing bytes, B-7 
instructions, 5-8 

Data handling and precision, 3-24 
Data organization, memory, 3-1 
Data transfer 

aligned data transfer, 1-10, 3-1 
I/O data transfer addressing, LE mode, 3-11 
Data types 
aligned scalars, 3-6 
misaligned scalars, 3-9 
nonscalars, 3-10 
dcbf, 4-61, 5-10, 8-44 
dcbi, 4-67, 5-19, 8-45 
dcbst, 4-61, 5-9, 8-46, 8-47
dcbt, 4-59, 5-8
dcbtst, 4-59, 5-8, 8-48
dcbz, 4-60, 5-9, 8-49, B-7
DEC (decrementer register) 
decrementer operation, 2-33 
POWER and PowerPC, B-9 
writing and reading the DEC, 2-34 
Decrementer exception, 6-5, 6-9, 6-36 
Defined instruction class, 4-3 
Denormalization, definition, 3-23 
Denormalized numbers, 3-20 
Direct-store facility, see Direct-store segment 
Direct-store segment 
description, 7-67 
direct-store address translation 
definition, 7-6 

selection, 7-7, 7-13, 7-34, 7-67 
direct-store facility, 7-6 
I/O interface considerations, 5-19 
instructions not supported, 7-68 
integer alignment exception, 6-31
key bit description, 7-10 
key/PP combinations, conditions, 7-44 
no-op instructions, 7-69 
protection, 7-10 
segment accesses, 7-68 
translation summary flow, 7-69 
divw, 4-14, 8-50 
divwu, 4-14, 8-51 
DSI exception 
description, 6-4 

partially executed instructions, 6-11, 6-23 
DSISR register 

settings for alignment exception, 6-30 
settings for DSI exception, 6-25 
settings for misaligned instruction, 6-32 


E 

EAR (external access register) 
bit format, 2-36 
eciwx, 4-63, 8-52 
ecowx, 4-63, 8-54, 8-56 
Effective address calculation 
address translation, 2-29, 7-1 
branches, 4-6, 4-41 
EA modifications, 3-7 
loads and stores, 4-6, 4-28, 4-36 
eieio, 4-58, 5-2 
eqv, 4-17, 8-58 
Exceptions 

alignment exception, 6-4, 6-28 
asynchronous exceptions, 6-3, 6-8 
classes of exceptions, 6-3, 6-12 
conditions for key/PP combinations, 7-44 
context synchronizing exception, 2-36
decrementer exception, 6-5, 6-9, 6-36 
DSI exception, 6-4, 6-11, 6-23 
enabling/disabling exceptions, 6-17 
exception classes, 6-3, 6-12 
exception conditions 
inexact, 3-43 
invalid operation, 3-37 
MMU exception conditions, 7-16 
overflow, 3-41 
overview, 6-4 

program exception conditions, 6-5, 6-34, 6-34 
recognizing/handling, 6-1 
underflow, 3-42 
zero divide, 3-38 
exception definitions, 6-20 
exception model, overview, 1-13 
exception priorities, 6-12 
exception processing 
description, 6-14 
stages, 6-2 
steps, 6-18 

exceptions, effects on FPSCR, B-6 
external interrupt, 6-4, 6-9, 6-27 
FP assist exception, 6-5, 6-40 
FP exceptions, B-8 

FP program exceptions, 3-28, 6-5, 6-34, 6-34 
FP unavailable exception, 6-5, 6-35 
IEEE FP enabled program exception 
condition, 6-5, 6-34 
illegal instruction program exception 
condition, 6-5, 6-34 
imprecise exceptions, 6-9 
instruction causing conditions, 4-9 
integer alignment exception, 6-31
ISI exception, 6-4, 6-26
LE mode alignment exception, 6-31
machine check exception, 6-4, 6-8, 6-22 
MMU-related exceptions, 7-15 
overview, 1-13 
precise exceptions, 6-6 

privileged instruction type program exception 
condition, 6-5, 6-34 
program exception 
conditions, 6-5, 6-34, 6-34 
register settings 
FPSCR, 3-28 
MSR, 6-20 
SRR0/SRR1, 6-14 
reset exception, 6-4, 6-8, 6-21, 6-21 
return from exception handler, 6-19 
summary, 4-9, 6-4 

synchronous/precise exceptions, 6-3, 6-7 
system call exception, 6-5, 6-37 


terminology, 6-2 
trace exception, 6-5, 6-38 
translation exception conditions, 7-15 
trap program exception condition, 6-5, 6-35 
vector offset table, 6-4 
Exclusive OR (XOR), 3-6 
Execution model 
floating-point, 3-15 
IEEE operations, D-1
in-order execution, 5-16 
multiply-add instructions, D-4 
out-of-order execution, 5-16 
sequential execution, 4-2 
Execution synchronization, 4-8, 6-7 
Extended mnemonics, see Simplified mnemonics 
Extended/primary opcodes, 4-3 
External control instructions, 4-63, 8-52-8-54, ??-8-56, A-23
External interrupt, 6-4, 6-9, 6-27 
extsb, 4-17, 8-59 
extsh, 4-17, 8-60 

F 

fabs, 4-28, 8-61 
fadd, 4-21, 8-62 
fadds, 4-21, 8-63 
fcmpo, 4-26, 8-64 
fcmpu, 4-26, 8-65 
fctiw, 4-25, 8-66 
fctiwz, 4-25, 8-67 
fdiv, 4-22, 8-68 
fdivs, 4-22, 8-69 
Floating-point model 
biased exponent format, 3-17 
binary FP numbers, 3-19 
data handling, 3-24 
denormalized numbers, 3-20
execution model 
floating-point, 3-15 
IEEE operations, D-1
multiply-add instructions, D-4 
FE0/FE1 bits, 2-22 

FP arithmetic instructions, 4-21, A-16
FP assist exceptions, 6-5 
FP compare instructions, 4-25, A-17 
FP data formats, 3-16 
FP execution model, 3-15 
FP load instructions, 4-38, A-19, D-15 
FP move instructions, 4-27, A-20 
FP multiply-add instructions, 4-23, A-16 
FP program exceptions 
description, 3-28, 6-34 
exception conditions, 6-5 
FE0/FE1 bits, 6-10 
POWER/PowerPC, MSR bit 20, B-8
FP rounding/conversion instructions, 4-24, A-17
FP store instructions, 4-40, A-20, B-7, D-17 
FP unavailable exception, 6-5, 6-35 
FPR0-FPR31, 2-4 
FPSCR instructions, 4-26, A-17 
IEEE floating-point fields, 3-17 
IEEE-754 compatibility, 1-10, 3-17 
infinities, 3-21 

models for FP instructions, D-6 
NaNs, 3-21 

normalization/denormalization, 3-23 
normalized numbers, 3-19 
precision handling, 3-24 
program exceptions, 3-28 
recognized FP numbers, 3-18 
rounding, 3-25 
sign of result, 3-22 

single-precision representation in FPR, 3-25 
value representation, FP model, 3-18 
zero values, 3-20 
Flow control instructions 
branch instruction address calculation, 4-41 
condition register logical, 4-51 
system linkage, 4-52, 4-64 
trap, 4-52 
fmadd, 4-23, 8-70
fmadds, 4-23, 8-71
fmr, 4-27, 8-72
fmsub, 4-23, 8-73
fmsubs, 4-23, 8-74
fmul, 4-21, 8-75
fmuls, 4-21, 8-76
fnabs, 4-28, 8-77
fneg, 4-28, 8-78
fnmadd, 4-24, 8-79
fnmadds, 4-24, 8-80
fnmsub, 4-24, 8-81
fnmsubs, 4-24, 8-82
FP assist exception, 6-40 
FP exceptions, 6-35, 6-40 
FPCC (floating-point condition code), 4-25 
FPECR (floating-point exception cause register), 2-32 
FPR0-FPR31 (floating-point registers), 2-4
FPSCR (floating-point status and control register) 
bit settings, 2-8, 3-29 
FP result flags in FPSCR, 3-31 
FPCC, 4-25 

FPSCR instructions, 4-26, A-17 
FR and FI bits, effects of exceptions, B-6 
move from FPSCR, B-7 
RN field, 3-26 
fres, 4-22, 8-83 
frsp, 3-24, 4-25, 8-85 
frsqrte, 4-22, 8-86, 8-89, 8-90 


fsel, 4-22, 8-88, D-5 
fsqrt, 4-22 
fsqrts, 4-22 
fsub, 4-21, 8-91, 8-92 
fsubs, 4-21, 8-92

G 

GPR0-GPR31 (general purpose registers), 2-3 
Graphics instructions 
fres, 4-22, 8-83 
frsqrte, 4-22, 8-86, 8-89, 8-90 
fsel, 4-22, 8-88 
stfiwx, 4-41, 8-179 
Guarded attribute (G) 

G-bit operation, 5-7, 5-16 
guarded memory, 5-17 
out-of-order execution, 5-16 


H 

Harvard cache model, 5-5 
Hashed page tables, 7-48 
Hashing functions 
page table 

primary PTEG, 7-52, 7-59 
secondary PTEG, 7-52, 7-60 


I 

I/O data transfer addressing, LE mode, 3-11
I/O interface considerations 
direct-store operations, 5-19 
memory-mapped I/O interface operations, 5-19 
icbi, 4-62, 5-11, 8-93 
IEEE 64-bit execution model, D-1
IEEE FP enabled program exception 
condition, 6-5, 6-34 
Illegal instruction class, 4-5 
Illegal instruction program exception 
condition, 6-5, 6-34 
Imprecise exceptions, 6-9 
Inexact exception condition, 3-43 
In-order execution, 5-16 
Instruction addressing 
LE mode examples, 3-11
Instruction cache instructions, 5-10 
Instruction restart, 3-14 
Instruction set conventions 
classes of instructions, 4-3 
computation modes, 4-3 
memory addressing, 4-6 
sequential execution model, 4-2 
Instructions 

64-bit bridge instructions 
optional instructions, 4-4 
boundedly undefined, definition, 4-3
branch instructions 
branch address calculation, 4-41 
branch conditional 

absolute addressing mode, 4-44 
CTR addressing mode, 4-46 
LR addressing mode, 4-45 
relative addressing mode, 4-42 
branch instructions, 4-50, A-20, F-5 
condition register logical, 4-51
conditional branch control, 4-47 
description, 4-50, A-20 
effective address calculation, 4-41 
system linkage, 4-52, 4-64 
trap, 4-52 

cache management instructions 
dcbf, 4-61, 5-10, 8-44 
dcbi, 4-67, 5-19, 8-45 
dcbst, 4-61, 5-9, 8-46, 8-47
dcbt, 4-59, 5-8
dcbtst, 4-59, 5-8, 8-48
dcbz, 4-60, 5-9, 8-49
eieio, 4-58, 5-2
icbi, 4-62, 5-11, 8-93
isync, 4-58, 5-12, 8-94
list of instructions, 4-59, 4-67, A-22 
classes of instructions, 4-3 
condition register logical, 4-51, A-21 
conditional branch control, 4-47 
context-altering instructions, 2-36 
context-synchronizing instructions, 2-36, 4-7 
defined instruction class, 4-3 
execution synchronization, 3-35 
external control instructions, 4-4, 4-63, A-23 
floating-point 

arithmetic, 4-21, 8-68, A-16
compare, 4-25, 8-64, A-17, F-3 
computational instructions, 3-15 
FP conversions, D-5 
FP load instructions, 4-38, A-19, D-15 
FP move instructions, 4-27, A-20 
FP store instructions, A-20, B-7, D-17 
FPSCR instructions, 4-26, A-17 
models for FP instructions, D-6 
multiply-add, 4-23, A-16, D-4 
noncomputational instructions, 3-15 
rounding/conversion, 4-24, ??-8-67, A-17
flow control instructions 
branch address calculation, 4-41 
CR logical, 4-51
system linkage, 4-52, 4-64 
trap, 4-52 

graphics instructions 
fres, 4-22, 8-83 


frsqrte, 4-22, 8-86, 8-89, 8-90 
fsel, 4-22, 8-88 
stfiwx, 4-41, 8-179 
illegal instruction class, 4-5 
instruction fetching 
branch/flow control instructions, 4-41 
direct-store segment, 7-15 
exception processing steps, 6-18 
exception synchronization steps, 6-6 
instruction cache instructions, 5-10 
integer store instructions, 4-33 
multiprocessor systems, 5-11 
precise exceptions, 6-6 
uniprocessor systems, 5-10 
instruction field conventions, xxxv 
instructions not supported, direct-store, 7-68 
integer 

arithmetic, 4-2, 4-10, A-14
compare, 4-14, A-15, F-3 
load, 4-31, A-17, A-17 
load/store multiple, 4-35, A-19, B-5 
load/store string, 4-36, A-19, B-5 
load/store with byte reverse, 4-34, A-18
logical, 4-2, 4-15, A-15
rotate/shift, 4-17-4-19, A-15-A-16, F-4
store, 4-33, A-18
invalid instruction forms, 4-4 
load and store 

address generation, floating-point, 4-36 
address generation, integer, 4-28 
byte reverse instructions, 4-34, A-18
floating-point load, 4-38, A-19 
floating-point move, 4-27, A-20 
floating-point store, 4-39, B-7 
integer load, 4-31, A-17, A-17 
integer store, 4-33, A-18 
memory synchronization, 4-54, 4-55, 4-57, A-19 
multiple instructions, 4-35, A-19, B-5 
string instructions, 4-36, A-19, B-5 
lookaside buffer management instructions, 4-66, 4-68, A-23

memory control instructions, 4-58, 4-66 
memory synchronization instructions 
eieio, 4-58, 5-2 
isync, 4-58, 5-12, 8-94
list of instructions, 4-55, 4-57, A-19
lwarx, 4-55, 8-120
stwcx., 4-55, 8-194
sync, 4-56, 5-3, 8-205, B-5 
new instructions 
mtmsrd, 7-64 
no-op, 4-4, F-22 
optional instructions, 4-4 
partially executed instructions, 6-11 
POWER instructions
deleted in PowerPC, B-9
supported in PowerPC, B-11
PowerPC instructions, list, A-1, A-8, A-14
preferred instruction forms, 4-4 
processor control 

instructions, 4-53, 4-56, 4-64, A-22 
reserved bits, POWER and PowerPC, B-2 
reserved instructions, 4-5 
segment register manipulation 
instructions, 4-67, A-23 
SLB management instructions, 4-68 
supervisor-level cache management 
instructions, 4-66 
supervisor-level instructions, 4-9 
system linkage instructions, 4-52, 4-64, A-21 
TLB management instructions, 4-68, A-23 
trap instructions, 4-52, A-21 
Integer alignment exception, 6-31 
Integer arithmetic instructions, 4-2, 4-10, A-14 
Integer compare instructions, 4-14, A-15, F-3 
Integer load instructions, 4-31, A-17, A-17 
Integer logical instructions, 4-2, 4-15, A-15 
Integer rotate and shift instructions, F-4 
Integer rotate/shift 

instructions, 4-17-4-19, A-15-A-16, F-4 
Integer store instructions 
description, 4-33 
instruction fetching, 4-33 
list, A-18

Interrupts, see Exceptions 

Invalid instruction forms, 4-4 

Invalid operation exception condition, 3-37 

ISI exception, 6-4, 6-26 

isync, 4-58, 5-12, 8-94 

K 

Key (Ks, Kp) protection bits, 7-42 

L 

lbz, 4-32, 8-95 
lbzu, 4-32, 8-96 
lbzux, 4-32, 8-97 
lbzx, 4-32, 8-98 
ldarx/stdcx. 

general information, 5-4, E-1
lfd, 4-39, 8-99 
lfdu, 4-39, 8-100 
lfdux, 4-39, 8-101 
lfdx, 4-39, 8-102 
lfs, 4-39, 8-103 
lfsu, 4-39, 8-104 
lfsux, 4-39, 8-105 


lfsx, 4-39, 8-106 
lha, 4-32, 8-107 
lhau, 4-32, 8-108 
lhaux, 4-32, 8-109 
lhax, 4-32, 8-110 
lhbrx, 4-35, 8-111
lhz, 4-32, 8-112 
lhzu, 4-32, 8-113 
lhzux, 4-32, 8-114 
lhzx, 4-32, 8-115 
Little-endian mode 
alignment exception, 6-31 
byte ordering, 3-2, 3-6 
description, 3-2 

I/O data transfer addressing, 3-11 
instruction addressing, 3-10 
LE and ILE bits, 3-6 
mapping, 3-5 
misaligned scalars, 3-9 
munged structure S, 3-7-3-8 
LK bit, inappropriate use, B-3 
lmw, 4-36, 8-116, B-5 
Load/store 

address generation, floating-point, 4-37 
address generation, integer, 4-28 
byte reverse instructions, 4-34, A-18
floating-point load instructions, 4-38, A-19
floating-point move instructions, 4-27, A-20
floating-point store instructions, 4-39, A-20, B-7
integer load instructions, 4-31, A-17
integer store instructions, 4-33, A-18
load/store multiple instructions, 4-35, A-19, B-5 
memory synchronization instructions, 4-54, A-19 
string instructions, 4-36, A-19, B-5 
Logical addresses 

translation into physical addresses, 7-1 
Logical instructions, integer, 4-2, 4-15, A-15 
Lookaside buffer management 
instructions, 4-66, 4-68, A-23 
lswi, 4-36, 8-117, B-5 
lswx, 4-36, 8-118, B-5 
lwarx, 4-54, 4-55, 8-120 
lwarx/stwcx. 

general information, 5-4, E-1
list insertion, E-6 
lwarx, 4-55, 8-120 
semaphores, 4-54 
stwcx., 4-55, 8-194 

synchronization primitive examples, E-2 
lwbrx, 4-35, 8-121 
lwz, 4-32, 8-122 
lwzu, 4-33, 8-123 
lwzux, 4-33, 8-124
lwzx, 4-33, 8-125 

M

Machine check exception 
causing conditions, 6-4, 6-8, 6-22 
non-recoverable, causes, 6-22 
register settings, 6-23 
mcrf, 4-51, 8-126 
mcrfs, 4-26, 8-127 
mcrxr, 4-53, 8-128 
Memory access 
ordering, 5-2 
update forms, B-4 
Memory addressing, 4-6 
Memory coherency 
coherency controls, 5-5 
coherency precautions, 5-7 
M-bit operation, 5-7, 5-7, 5-15 
memory access modes, 5-6 
sync instruction, 5-3 
Memory control instructions 
segment register manipulation, 4-67, A-23 
SLB management, 4-68 
supervisor-level cache management, 4-66 
TLB management, 4-68 
user-level cache, 4-58 
Memory management unit 
address translation flow, 7-11 
address translation mechanisms, 7-6, 7-10 
address translation types, 7-8 
block address translation, 7-7, 7-11, 7-19 
conceptual block diagram, 7-5 
direct-store address translation, 7-13, 7-67 
exceptions summary, 7-14 
hashing functions, 7-52 
instruction summary, 7-17 
memory addressing, 7-3 
memory protection, 7-8, 7-30, 7-42 
MMU exception conditions, 7-16 
MMU organization, 7-4 
MMU registers, 7-18 
MMU-related exceptions, 7-14 
overview, 1-13, 7-2 

page address translation, 7-6, 7-13, 7-46 
page history status, 7-10, 7-38, 7-40 
page table search operation, 7-48 
real addressing mode translation, 7-11, 7-18, 7-33 
register summary, 7-18 
segment model, 7-32 
Memory operands, 3-1, 4-6 
Memory segment model 
description, 7-32 
memory segment selection, 7-33 
page address translation 
overview, 7-34 
PTE definitions, 7-37 


summary, 7-46 
page history recording 
changed (C) bit, 7-40 
description, 7-38 
referenced (R) bit, 7-39 
table search operations, update history, 7-39 
page memory protection, 7-42 
recognition of addresses, 7-33 
referenced/changed bits 
changed (C) bit, 7-40 
guaranteed bit settings, model, 7-41 
recording scenarios, 7-40 
referenced (R) bit, 7-39 
synchronization of updates, 7-42 
table search operations, update history, 7-39 
updates to page tables, 7-63 
Memory synchronization 
eieio, 4-58, 5-2 
isync, 4-58, 5-12, 8-94 
list of instructions, 4-55, 4-57, A-19 
lwarx, 4-54, 4-55, 8-120 
stwcx., 4-54, 4-55, 8-194 
sync, 4-56, 5-3, 8-205, B-5 
Memory, data organization, 3-1 
Memory/cache access modes, see WIMG bits 
mfcr, 4-53, 8-129 
mffs, 4-26, 8-130 
mfmsr, 4-65, 8-131, B-1
mfspr, 4-53, 4-65, 8-132, B-6
mfsr (64-bit bridge), 4-68, B-1
mfsrin (64-bit bridge), 4-68, 8-136 
mftb, 4-57, 8-137 
Migration to PowerPC, B-1
Misaligned accesses and alignment, 3-1 
Mnemonics 

recommended mnemonics, F-22 
simplified mnemonics, F-1
Move to/from CR instructions, 4-53 
MSR (machine state register) 
bit settings, 2-21 
EE bit, 6-17 

FE0/FE1 bits, 2-22, 6-10 
FE0/FE1 bits and FP exceptions, 3-34 
LE and ILE bits, 1-9, 3-6 
optional bits (SE and BE), 2-21 
RI bit, 6-19 

settings due to exception, 6-20 
mtcrf, 4-53, 8-139 
mtfsb0, 4-27, 8-140
mtfsb1, 4-27, 8-141
mtfsf, 4-27, 8-142 
mtfsfi, 4-27, 8-143 
mtmsr (64-bit bridge), 4-65, 8-144 
mtmsrd, 7-64 
mtspr, 4-53, 4-65, 8-145, B-6
mtsr (64-bit bridge), 4-68, 8-135, 8-148 
mtsrin (64-bit bridge), 4-68, 8-149 
mulhw, 4-14, 8-150 
mulhwu, 4-14, 8-151 
mulli, 4-13, 8-152 
mullw, 4-13, 8-153 
Multiple register loads, B-5 
Multiple-precision shift examples, C-1
Multiply-add
execution model, D-4
instructions, floating-point, 4-23, A-16
Multiprocessor, usage, 5-1 
Munging 
description, 3-6 
LE mapping, 3-7-3-8 


N 

nand, 4-16, 8-154 

NaNs (Not a Numbers), 3-21 

neg, 4-13, 8-155 

No-execute protection, 7-8, 7-12 

Nonscalars, 3-10 

No-op, 4-4, F-22 

nor, 4-16, 8-156 

Normalization, definition, 3-23 

Normalized numbers, 3-19 

O

OEA (operating environment architecture) 
cache model and memory coherency, 5-1 
definition, xxvi, 1-5 

general changes to the architecture, 1-16, 1-17 
implementing exceptions, 6-1
memory management specifications, 7-1 
programming model, 2-18 
register set, 2-17 
Opcodes, primary/extended, 4-3 
Operands 

BO operand encodings, 2-13, 4-47, B-3 
conventions, description, 1-9, 3-1 
memory operands, 4-6 
placement 

effect on performance, summary, 3-12 
instruction restart, 3-14 
Operating environment architecture, see OEA 
Optional instructions, 4-4, A-30 
or, 4-16, 8-157 
orc, 4-17, 8-158
ori, 4-16, 8-159 
oris, 4-16, 8-160 
Out-of-order execution, 5-16 
Overflow exception condition, 3-41 


P 


Page address translation 
definition, 7-6 

integer alignment exception, 6-31
overview, 7-34 

page address translation flow, 7-46 
page memory protection, 7-28, 7-42 
page size, 7-32 
page tables in memory, 7-48 
PTE definitions, 7-37 
segment descriptors, 7-33 
selection of page address translation, 7-6, 7-13 
summary, 7-46 
Page history status 

making R and C bit updates to page tables, 7-63 
R and C bit recording, 7-10, 7-38, 7-40 
R and C bit updates, 7-63 

Page memory protection, see Protection of memory areas

Page tables 

allocation of PTEs, 7-56 
definition, 7-49 

example table structures, ?? — 7-58 
hashed page tables, 7-48 
hashing functions, 7-52, 7-60 
organized as PTEGs, 7-49 
page table size, 7-51 
page table structure summary, 7-56 
page table updates, 7-63 
PTEG addresses, 7-58 
table search flow, 7-62 
Page, definition, 5-6 
Performance 

effect of operand placement, summary, 3-12 
instruction restart, 3-14 
Physical address generation 
generation of PTEG addresses, 7-58 
memory management unit, 7-1 
Physical memory 
physical vs. virtual memory, 5-1 
predefined locations, 7-3 
PIR (processor identification register), 2-36 
POWER architecture 
AL bit in MSR, B-2 
alignment for load/store multiple, B-5 
branch conditional to CTR, B-4 
differences in implementations, B-4 
FP exceptions, B-8 
instructions 

dclz/dcbz instructions, differences, B-7 
deleted in PowerPC, B-9 
load/store multiple, alignment, B-5 
load/store string instructions, B-5 
move from FPSCR, B-7 
move to/from SPR, B-6
reserved bits, POWER and PowerPC, B-2
SR instructions, differences from PowerPC, B-7
supported in PowerPC, B-11
svcx/sc instructions, differences, B-4
memory access update forms, B-4
migration to PowerPC, B-1
POWER/PowerPC incompatibilities, B-1
registers 
CR settings, B-2 
decrementer register, B-9 
multiple register loads, B-5 
reserved bits, POWER and PowerPC, B-2 
RTC (real-time clock), B-8 
synchronization, B-5 

timing facilities, POWER and PowerPC, B-8 
TLB entry invalidation, B-8 
PowerPC architecture 
alignment for load/store multiple, B-5 
byte ordering, 3-6 
cache model, Harvard, 5-5
computation modes, 1-4, 4-3 
differences in implementations, B-4 
features summary 
defined features, 1-3, 1-6 
features not defined, 1-6 
I/O data transfer addressing, 3-11 
instruction addressing, 3-10 
instruction list, A-1, A-8, A-14
instructions 

dcbz/dclz instructions, differences, B-7 
deleted in POWER, B-9 
load/store multiple, alignment, B-5 
load/store string instructions, B-5 
move from FPSCR, B-7 
move to/from SPR, B-6 
reserved bits, POWER and PowerPC, B-2 
SR instructions, differences from POWER, B-7 
supported in POWER, B-11
svcx/sc instructions, differences, B-4
levels of the PowerPC architecture, 1-4-1-6
memory access update forms, B-4 
operating environment architecture, xxvi, 1-5 
overview, 1-2 

POWER/PowerPC, incompatibilities, B-1
registers 
CR settings, B-2 
decrementer register, B-9 
multiple register loads, B-5 
programming model, 1-7, 2-2, 2-14, 2-18 
reserved bits, POWER and PowerPC, B-2 
synchronization, B-5 

timing facilities, POWER and PowerPC, B-8 
TLB entry invalidation, B-8 


user instruction set architecture, xxv, 1-4 
virtual environment architecture, xxv, 1-4 
PP protection bits, 7-42 
Precise exceptions, 6-3, 6-6, 6-7 
Preferred instruction forms, 4-4 
Primary/extended opcodes, 4-3 
Priorities, exception, 6-12 
Privilege levels 

external control instructions, 4-63 
supervisor/user mode, 1-8 
supervisor-level cache control instruction, 4-66 
TBR encodings, 4-57 
user-level cache control instructions, 4-58 
Privileged instruction type program exception 
condition, 6-5, 6-34 
Privileged state, see Supervisor mode 
Problem state, see User mode 
Process switching, 6-19 

Processor control instructions, 4-53, 4-56, 4-64, A-22 
Program exception 
description, 3-28, 6-5, 6-34, 6-34 
five (5) program exception conditions, 6-5, 6-34 
move to/from SPR, B-6 
Programming model 
all registers (OEA), 2-18 
user-level plus time base (VEA), 2-14 
user-level registers (UISA), 2-2 
Protection of memory areas 
block access protection, 7-27, 7-28, 7-30, 7-42 
direct-store segment protection, 7-10, 7-68 
no-execute protection, 7-8, 7-12 
options available, 7-8, 7-42 
page access protection, 7-28, 7-30, 7-42 
programming protection bits, 7-42 
protection violations, 7-15, 7-30, 7-43 
PTEGs (PTE groups) 
definition, 7-49 

example primary and secondary PTEGs, 7-58 
PTEs (page table entries) 
adding a PTE, 7-64 
modifying a PTE, 7-65 
page table definition, 7-49 
page table updates, 7-63 
PTE bit definitions, 7-38 
PVR (processor version register), 2-23 

Q 

Quiet NaNs (QNaNs) 
description, 3-21 
representation, 3-22 


R 

Real address (RA), see Physical address generation 
Real addressing mode address translation (translation disabled)

data/instruction accesses, 7-11, 7-18, 7-33 
definition, 7-6 

Real numbers, approximation, 3-18 
Record bit (Rc) 
description, 8-3 
inappropriate use, B-3 
Referenced (R) bit maintenance 
page history information, 7-10 
recording, 7-10, 7-38, 7-39, 7-40 
updates, 7-63 
Registers 

configuration registers 
MSR, 2-20 
PVR, 2-23 

exception handling registers 
DAR, 2-29 
DSISR, 2-30 
FPECR (optional), 2-32 
list, 2-19 

SPRG0-SPRG3, 2-30 
SRR0/SRR1, 2-31 
memory management registers 
BATs, 2-24 
list, 2-19 
SDR1, 2-27 
SRs, 2-28 

miscellaneous registers 
DABR (optional), 2-34 
DEC, 2-33 
EAR (optional), 2-35 
list, 2-20 

PIR (optional), 2-36 
TBL/TBU, 2-15 
MMU registers, 7-18 
multiple register loads, B-5 
OEA register set, 2-17 
optional registers 
DABR, 2-34 
EAR, 2-35 
FPECR, 2-32 
PIR, 2-36 

reserved bits, POWER and PowerPC, B-2 
supervisor-level 
BATs, 2-24, 7-25 
DABR, 6-24 
DABR (optional), 2-34 
DAR, 2-29 
DEC, 2-33, B-9 
DSISR, 2-30 
EAR (optional), 2-35 
FPECR (optional), 2-32 
MSR, 2-20 
PIR (optional), 2-36 


PVR, 2-23 
SDR1, 2-27 
SPRG0-SPRG3, 2-30 
SRR0/SRR1, 2-31 
SRs, 2-28 
TBL/TBU, 2-15 
UISA register set, 2-1 
user-level 
CR, 2-5 
CTR, 2-12 
FPR0-FPR31, 2-4 
FPSCR, 2-7 
GPR0-GPR31, 2-3 
LR, 2-12 
TBL/TBU, 2-32 
XER, 2-11, B-4 
VEA register set, 2-13 
Reserved instruction class, 4-5 
Reset exception, 6-4, 6-8, 6-21 
Return from exception handler, 6-19 
rfi (64-bit bridge), 4-64, 8-161 
rlwimi, 4-19, 8-162 
rlwinm, 4-18, 8-163 
rlwnm, 4-19, 8-165 

Rotate/shift instructions, 4-17–4-19, A-15–A-16, F-4 
Rounding, floating-point operations, 3-25 
Rounding/conversion instructions, FP, 4-24 
RTC (real time clock), B-8 


S 


sc 

differences in implementation, POWER and 
PowerPC, B-4 

for context synchronization, 4-7 
occurrence of system call exception, 6-37 
user-level function, 4-52, 4-64, 8-166 
Scalars 

aligned, LE mode, 3-6 
big-endian, 3-2 
description, 3-2 
little-endian, 3-2 
SDR1 register 
definitions, 7-50 
format, 7-50 

generation of PTEG addresses, 7-58 
Segment registers 
instructions 

32-bit implementations only, 7-36 
POWER/PowerPC, differences, B-7 
segment descriptor 
format, 7-35 

SR manipulation instructions, 4-67, A-23 
T = 1 format (direct-store), 7-67 
T-bit, 2-28, 7-33 
Segmented memory model, see Memory management 
unit 

Sequential execution model, 4-2 
Shift/rotate instructions, 4-17–4-19, A-15–A-16, F-4 
Signaling NaNs (SNaNs), 3-21 
Simplified mnemonics 
branch instructions, F-5 
compare instructions, F-3 
CR logical instructions, F-18 
recommended mnemonics, 4-56, F-22 
rotate and shift, F-4 
special-purpose registers (SPRs), F-21 
subtract instructions, F-2 
trap instructions, F-19 
SLB management instructions, 4-68 
slw, 4-19, 8-167 
SNaNs (signaling NaNs), 3-21 
Special-purpose registers (SPRs), F-21 
SPRG0-SPRG3, conventional uses, 2-30 
sraw, 4-20, 8-168 
srawi, 4-20, 8-169 

SRR0/SRR1 (status save/restore registers) 
format, 2-31 

machine check exception, register settings, 6-23 
srw, 4-19, 8-170 
stb, 4-33, 8-171 
stbu, 4-33, 8-172 
stbux, 4-34, 8-173 
stbx, 4-33, 8-174 
stdcx./ldarx 

general information, 5-4, E-1 
stfd, 4-40, 8-175 
stfdu, 4-40, 8-176 
stfdux, 4-41, 8-177 
stfdx, 4-40, 8-178 
stfiwx, 4-41, 8-179, D-17 
stfs, 4-40, 8-180 
stfsu, 4-40, 8-181 
stfsux, 4-40, 8-182 
stfsx, 4-40, 8-183 
sth, 4-34, 8-184 
sthbrx, 4-35, 8-185 
sthu, 4-34, 8-186 
sthux, 4-34, 8-187 
sthx, 4-34, 8-188 
stmw, 4-36, 8-189 
Structure mapping examples, 3-3 
stswi, 4-36, 8-190 
stswx, 4-36, 8-191 
stw, 4-34, 8-192 
stwbrx, 4-35, 8-193 
stwcx., 4-54, 4-55, 8-194 
stwcx./lwarx 

general information, 5-4, E-1 


lwarx, 4-55, 8-120 
semaphores, 4-54 
stwcx., 4-55, 8-194 

synchronization primitive examples, E-2 
stwu, 4-34, 8-196 
stwux, 4-34, 8-197 
stwx, 4-34, 8-198 
subf, 4-11, 8-199 
subfc, 4-12, 8-200 
subfe, 4-12, 8-201 
subfic, 4-11, 8-202 
subfme, 4-12, 8-203 
subfze, 4-13, 8-204 
Subtract instructions, F-2 
Supervisor mode, see Privilege levels 
sync, 4-56, 5-3, 8-205, B-5 
Synchronization 
compare and swap, E-4 

context/execution synchronization, 2-36, 4-7, 6-6 
context-altering instruction, 2-36 
context-synchronizing exception, 2-36 
context-synchronizing instruction, 2-36 
data access synchronization, 2-37 
execution of rfi, 6-19 
implementation-dependent 
requirements, 2-38, 2-39 
instruction access synchronization, 2-38 
list insertion, E-6 
lock acquisition and release, E-5 
memory synchronization instructions, 4-54, A-19 
overview, 6-6 

requirements for lookaside buffers, 2-36 
requirements for special registers, 2-36 
rfi/rfid, 2-37 

synchronization primitives, E-2 
synchronization programming examples, E-1 
synchronizing instructions, 1-11, 2-37 
Synchronous exceptions 
causes, 6-3 
classifications, 6-3 
exception conditions, 6-7 
System call exception, 6-5, 6-37 
System IEEE FP enabled program exception 
condition, 6-5, 6-34 
System linkage instructions 
list of instructions, A-21 
rfi, 8-161 

sc, 4-52, 4-64, 8-166 
System reset exception, 6-4, 6-8, 6-21 

T 

Table search operations 
hashing functions, 7-52 
page table definition, 7-49 
SDR1 register, 7-50 

table search flow (primary and secondary), 7-62 
Terminology conventions, xxxv 
Time base 

computing time of day, 2-16 
reading the time base, 2-16 
TBL/TBU, 2-15 

timer facilities, POWER and PowerPC, B-8 
writing to the time base, 2-32 
Tiny values, definition, 3-18 
TLB invalidate 
TLB entry invalidation, B-8 
TLB invalidate broadcast operations, 7-18, 7-63 
TLB management instructions, A-23 
tlbie instruction, 7-18, 7-63 
TLB management instructions, 4-68 
tlbia, 4-69 

tlbie, 4-69, 8-207, B-8 

tlbsync, 4-69, 8-208 

tlbsync instruction emulation, 7-63 

TO operand, F-21 

Trace exception, 6-5, 6-38 

Trap instructions, 4-52, F-19 

Trap program exception condition, 6-5, 6-35 

tw, 4-52, 8-209 

twi, 4-52, 8-210 

U 

UISA (user instruction set architecture) 
definition, xxv, 1-4 

general changes to the architecture, 1-15 
programming model, 2-2 
register set, 2-1 

Underflow exception condition, 3-42 
User instruction set architecture, see UISA 
User mode, see Privilege levels 
User-level registers, list, 2-2, 2-14 

V 

VEA (virtual environment architecture) 
cache model and memory coherency, 5-1 
definition, xxv, 1-4 

general changes to the architecture, 1-16 
programming model, 2-14 
register set, 2-13 
time base, 2-15 

Vector offset table, exception, 6-4 
Virtual address 
formation, 2-29 

Virtual environment architecture, see VEA 
Virtual memory 
implementation, 7-2 
virtual vs. physical memory, 5-1 


W 

WIMG bits, 5-6, 7-64 
description, 5-13 
G-bit, 5-16 
in BAT register, 7-26 
in BAT registers, 2-25, 5-13 
WIM combinations, 5-15 
Write-back mode, 5-14 
Write-through attribute (W) 
write-through/write-back operation, 5-6, 5-14 


X 

XER register 
bit definitions, 2-11 

difference from POWER architecture, B-4 
xor, 4-16, 8-211 
XOR (exclusive OR), 3-6 
xori, 4-16, 8-212 
xoris, 4-16, 8-213 

Z 

Zero divide exception condition, 3-38 
Zero numbers, format, 3-20 
Zero values, 3-20 